import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
%matplotlib inline
# Library needed for performing statistical tests
from scipy import stats
# Needed for normalisation
from sklearn import preprocessing
from sklearn import cluster, metrics
from sklearn.decomposition import PCA
# Libraries needed for training the clustering algorithms
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture as GMM
from sklearn.cluster import DBSCAN
# Import the silhouette_score function from sklearn.metrics
from sklearn.metrics import silhouette_score
# Library needed for social network graph
import networkx as nx
import warnings
warnings.filterwarnings('ignore')
The Metro Bike dataset contains anonymized trip data from the Los Angeles Metro Bike Share (Source: https://bikeshare.metro.net/about/data/).
Each row corresponds to a single bike trip; the columns are described below:
| Column | Description |
|---|---|
| trip_id | Locally unique integer that identifies the trip. |
| duration | Length of trip in minutes. |
| start_time | The date/time when the trip began, presented in ISO 8601 format in local time. |
| end_time | The date/time when the trip ended, presented in ISO 8601 format in local time. |
| start_station | The station ID where the trip originated (for station name and more information on each station see the Station Table). |
| start_lat | The latitude of the station where the trip originated. |
| start_lon | The longitude of the station where the trip originated. |
| end_station | The station ID where the trip terminated (for station name and more information on each station see the Station Table). |
| end_lat | The latitude of the station where the trip terminated. |
| end_lon | The longitude of the station where the trip terminated. |
| bike_id | Locally unique integer that identifies the bike. |
| plan_duration | The number of days that the plan the passholder is using entitles them to ride; 0 is used for a single ride plan (Walk-up). |
| trip_route_category | "Round Trip" for trips starting and ending at the same station or "One Way" for all other trips. |
| passholder_type | The name of the passholder's plan. |
| bike_type | The kind of bike used on the trip, including standard pedal-powered bikes, electric assist bikes, or smart bikes. |
# Load `metro.csv` into a dataframe `metro_bike` and display the first five rows.
metro_bike = pd.read_csv('metro.csv')
metro_bike.head()
| | trip_id | duration | start_time | end_time | start_station | start_lat | start_lon | end_station | end_lat | end_lon | bike_id | plan_duration | trip_route_category | passholder_type | bike_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 124657107 | 5 | 7/1/2019 0:04 | 7/1/2019 0:09 | 4312 | 34.066990 | -118.290878 | 4410 | 34.063351 | -118.296799 | 6168 | 30 | One Way | Monthly Pass | standard |
| 1 | 124657587 | 9 | 7/1/2019 0:07 | 7/1/2019 0:16 | 3066 | 34.063389 | -118.236160 | 3066 | 34.063389 | -118.236160 | 17584 | 30 | Round Trip | Monthly Pass | electric |
| 2 | 124658068 | 5 | 7/1/2019 0:20 | 7/1/2019 0:25 | 4410 | 34.063351 | -118.296799 | 4312 | 34.066990 | -118.290878 | 18920 | 30 | One Way | Monthly Pass | electric |
| 3 | 124659747 | 20 | 7/1/2019 0:44 | 7/1/2019 1:04 | 3045 | 34.028511 | -118.256668 | 4275 | 34.012520 | -118.285896 | 6016 | 1 | One Way | Walk-up | standard |
| 4 | 124660227 | 27 | 7/1/2019 0:44 | 7/1/2019 1:11 | 3035 | 34.048401 | -118.260948 | 3049 | 34.056969 | -118.253593 | 5867 | 30 | One Way | Monthly Pass | standard |
Looking at the data above, we can see that start_time and end_time are not in ISO 8601 format: they appear as 7/1/2019 0:04 rather than 2019-07-01 00:04:00.
# Use `.describe()` to get summary statistics for the numerical columns
metro_bike.describe()
| | trip_id | duration | start_station | start_lat | start_lon | end_station | end_lat | end_lon | plan_duration |
|---|---|---|---|---|---|---|---|---|---|
| count | 9.212400e+04 | 92124.000000 | 92124.000000 | 89985.000000 | 89985.000000 | 92124.000000 | 88052.000000 | 88052.000000 | 92124.000000 |
| mean | 1.274286e+08 | 33.168588 | 3484.899690 | 34.034786 | -118.287893 | 3480.271026 | 34.034895 | -118.286699 | 60.290977 |
| std | 1.524134e+06 | 129.057841 | 611.483883 | 0.058803 | 0.073501 | 609.942741 | 0.058790 | 0.072628 | 111.141364 |
| min | 1.246571e+08 | 1.000000 | 3000.000000 | 33.710979 | -118.495422 | 3000.000000 | 33.710979 | -118.495422 | 1.000000 |
| 25% | 1.261375e+08 | 6.000000 | 3029.000000 | 34.035801 | -118.281181 | 3028.000000 | 34.037048 | -118.280952 | 1.000000 |
| 50% | 1.274911e+08 | 12.000000 | 3062.000000 | 34.046810 | -118.258537 | 3062.000000 | 34.046810 | -118.258537 | 30.000000 |
| 75% | 1.287379e+08 | 22.000000 | 4285.000000 | 34.051941 | -118.248253 | 4285.000000 | 34.051941 | -118.248253 | 30.000000 |
| max | 1.303877e+08 | 1440.000000 | 4453.000000 | 34.177662 | -118.231277 | 4453.000000 | 34.177662 | -118.231277 | 999.000000 |
A brief glance at this summary shows that plan_duration has a maximum value of 999, far above its 75th percentile of 30. This looks like an outlier and is flagged for later inspection.
The duration column also appears to contain outliers: the maximum trip duration is 1440 minutes, which is 24 hours. The source mentions that trip durations can last 24 hours, so the column is left as is.
metro_bike.shape
(92124, 15)
With 92124 rows and 15 columns, the dataset is large enough to tolerate preprocessing steps that remove rows.
metro_bike.dtypes
trip_id                  int64
duration                 int64
start_time              object
end_time                object
start_station            int64
start_lat              float64
start_lon              float64
end_station              int64
end_lat                float64
end_lon                float64
bike_id                 object
plan_duration            int64
trip_route_category     object
passholder_type         object
bike_type               object
dtype: object
.dtypes shows that start_time and end_time are stored as object (string) columns. To work with them as dates and times, they need to be converted to a datetime datatype.
# Check for null/missing values to handle in data cleaning
metro_bike.isnull().sum()
trip_id                   0
duration                  0
start_time                0
end_time                  0
start_station             0
start_lat              2139
start_lon              2139
end_station               0
end_lat                4072
end_lon                4072
bike_id                   0
plan_duration             0
trip_route_category       0
passholder_type           0
bike_type                 0
dtype: int64
Perform data cleaning next based on the information gleaned from this section.
# Since `trip_id` is unique for all the values, we can set it to be the index.
metro_bike.set_index('trip_id', inplace=True)
metro_bike.head()
| trip_id | duration | start_time | end_time | start_station | start_lat | start_lon | end_station | end_lat | end_lon | bike_id | plan_duration | trip_route_category | passholder_type | bike_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 124657107 | 5 | 7/1/2019 0:04 | 7/1/2019 0:09 | 4312 | 34.066990 | -118.290878 | 4410 | 34.063351 | -118.296799 | 6168 | 30 | One Way | Monthly Pass | standard |
| 124657587 | 9 | 7/1/2019 0:07 | 7/1/2019 0:16 | 3066 | 34.063389 | -118.236160 | 3066 | 34.063389 | -118.236160 | 17584 | 30 | Round Trip | Monthly Pass | electric |
| 124658068 | 5 | 7/1/2019 0:20 | 7/1/2019 0:25 | 4410 | 34.063351 | -118.296799 | 4312 | 34.066990 | -118.290878 | 18920 | 30 | One Way | Monthly Pass | electric |
| 124659747 | 20 | 7/1/2019 0:44 | 7/1/2019 1:04 | 3045 | 34.028511 | -118.256668 | 4275 | 34.012520 | -118.285896 | 6016 | 1 | One Way | Walk-up | standard |
| 124660227 | 27 | 7/1/2019 0:44 | 7/1/2019 1:11 | 3035 | 34.048401 | -118.260948 | 3049 | 34.056969 | -118.253593 | 5867 | 30 | One Way | Monthly Pass | standard |
We clean the data in this subsection by dropping every row that contains a missing value, using .dropna.
# `inplace` is added as a parameter to modify the current dataframe itself.
metro_bike.dropna(inplace=True)
metro_bike.isnull().sum()
duration               0
start_time             0
end_time               0
start_station          0
start_lat              0
start_lon              0
end_station            0
end_lat                0
end_lon                0
bike_id                0
plan_duration          0
trip_route_category    0
passholder_type        0
bike_type              0
dtype: int64
metro_bike.shape
(86760, 14)
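The shape of the dataset has changed: the number of rows has dropped from 92124 to 86760, a reduction of 5364. The missing start and end coordinates add up to 2139 + 4072 = 6211 rows, so, assuming latitude and longitude are always missing together, 847 rows were missing both their start and end coordinates.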
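The overlap between missing values can be checked directly rather than inferred from the row counts. The following is a minimal sketch on a small hypothetical dataframe (the values are made up and `toy` is not part of the notebook); the same boolean masks could be built on `metro_bike` before calling `dropna`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for metro_bike: five trips, some with missing coordinates
toy = pd.DataFrame({
    "start_lat": [34.05, np.nan, np.nan, 34.06, 34.04],
    "end_lat":   [34.06, 34.05, np.nan, np.nan, 34.03],
})

missing_start = toy["start_lat"].isna()
missing_end = toy["end_lat"].isna()

# Sum of the per-column missing counts (counts overlapping rows twice)
per_column = int(missing_start.sum() + missing_end.sum())
# Rows where both the start and the end coordinate are missing
both_missing = int((missing_start & missing_end).sum())
# Rows actually removed by dropna
rows_dropped = len(toy) - len(toy.dropna())

print(per_column, both_missing, rows_dropped)
```

The number of dropped rows equals the per-column total minus the rows counted twice, which is the same relationship used to explain the change in shape above.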
# `.info` is used to find a brief overview of the non null rows and data type.
metro_bike.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 86760 entries, 124657107 to 130053088
Data columns (total 14 columns):
 #   Column               Non-Null Count  Dtype
---  ------               --------------  -----
 0   duration             86760 non-null  int64
 1   start_time           86760 non-null  object
 2   end_time             86760 non-null  object
 3   start_station        86760 non-null  int64
 4   start_lat            86760 non-null  float64
 5   start_lon            86760 non-null  float64
 6   end_station          86760 non-null  int64
 7   end_lat              86760 non-null  float64
 8   end_lon              86760 non-null  float64
 9   bike_id              86760 non-null  object
 10  plan_duration        86760 non-null  int64
 11  trip_route_category  86760 non-null  object
 12  passholder_type      86760 non-null  object
 13  bike_type            86760 non-null  object
dtypes: float64(4), int64(4), object(6)
memory usage: 9.9+ MB
metro_bike.duration.hist(bins=50, range=(1,45))
plt.xlabel("Minutes")
plt.ylabel("Frequency")
Text(0, 0.5, 'Frequency')
The histogram gives us a distribution of the duration of a single bike journey, ranging from 1 minute to 45 minutes.
# Use .mean() and .median() for mean and median values
metro_bike.duration.mean(), metro_bike.duration.median()
(26.99640387275242, 11.0)
metro_bike.hist(figsize=(10, 10))
array([[<AxesSubplot:title={'center':'duration'}>,
<AxesSubplot:title={'center':'start_station'}>,
<AxesSubplot:title={'center':'start_lat'}>],
[<AxesSubplot:title={'center':'start_lon'}>,
<AxesSubplot:title={'center':'end_station'}>,
<AxesSubplot:title={'center':'end_lat'}>],
[<AxesSubplot:title={'center':'end_lon'}>,
<AxesSubplot:title={'center':'plan_duration'}>, <AxesSubplot:>]],
dtype=object)
Multiple columns of the dataset (numerical values) are displayed in the histogram above.
We create a new dataframe where we filter the values to display durations less than 6 hours and plot the new histogram.
metro_bike_new = metro_bike[metro_bike.duration < 360]
metro_bike_new.shape
(85974, 14)
metro_bike_new.duration.hist(bins=50, range=(1,90))
plt.xlabel("Minutes")
plt.ylabel("Frequency")
Text(0, 0.5, 'Frequency')
.value_counts() is the pandas method that returns the count of each unique value in the selected column.
metro_bike['plan_duration'].value_counts()
30     55907
1      21451
365     9375
999       27
Name: plan_duration, dtype: int64
plan_duration records the number of days a pass entitles the user to ride; we plot its counts below using a Seaborn countplot.
p_d = sns.countplot(x='plan_duration', data=metro_bike)
for p in p_d.patches:
    # Label each bar with its count, formatted as a whole number
    p_d.text(p.get_x() + p.get_width()/2, p.get_height(), f'{p.get_height():.0f}', ha='center')
p_d.set(xlabel='Plan Duration', ylabel='Count')
p_d.set_title('Count of Plan Duration')
Text(0.5, 1.0, 'Count of Plan Duration')
The counts of trip_route_category show how many one-way and round-trip journeys riders made; we plot them below.
metro_bike['trip_route_category'].value_counts()
One Way       72029
Round Trip    14731
Name: trip_route_category, dtype: int64
sns.set(rc={'figure.figsize':(11,9)})
trip_cat = sns.countplot(x='trip_route_category', data=metro_bike)
for p in trip_cat.patches:
    # Label each bar with its count, formatted as a whole number
    trip_cat.text(p.get_x() + p.get_width()/2, p.get_height(), f'{p.get_height():.0f}', ha='center')
trip_cat.set(xlabel='Trip Route', ylabel='Count')
trip_cat.set_title('Count of various Trip Categories')
Text(0.5, 1.0, 'Count of various Trip Categories')
We look at the various passes available for purchase to users.
metro_bike['passholder_type'].value_counts()
Monthly Pass    55904
Walk-up         21258
Annual Pass      5966
One Day Pass     3599
Testing            27
Flex Pass           6
Name: passholder_type, dtype: int64
An interesting observation is that the Testing pass has the same count (27) as the value 999 in the plan_duration column. An assumption is that when someone tests the service, LA Metro Bike issues a pass with a plan duration of 999 days so that these trips do not disrupt the rest of the data.
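Matching counts alone do not prove that the 999-day plans and the Testing passes are the same rows. A cross-tabulation can confirm it. The sketch below uses a small hypothetical dataframe (`toy` and its values are made up for illustration); on the real data the same `pd.crosstab` call would take the two `metro_bike` columns.

```python
import pandas as pd

# Hypothetical stand-in for metro_bike with the two columns of interest
toy = pd.DataFrame({
    "passholder_type": ["Testing", "Testing", "Monthly Pass", "Walk-up", "Annual Pass"],
    "plan_duration":   [999, 999, 30, 1, 365],
})

# Rows are pass types, columns are plan durations, cells are trip counts
table = pd.crosstab(toy["passholder_type"], toy["plan_duration"])

# Which pass types ever appear with a plan duration of 999?
types_with_999 = set(toy.loc[toy["plan_duration"] == 999, "passholder_type"])

print(table)
print(types_with_999)
```

If the set contains only `Testing`, and the `Testing` row of the table is zero everywhere except under 999, the two groups coincide.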
sns.set(rc={'figure.figsize':(11,7)})
pass_type = sns.countplot(x='passholder_type', data=metro_bike)
for p in pass_type.patches:
    # Label each bar with its count, formatted as a whole number
    pass_type.text(p.get_x() + p.get_width()/2, p.get_height(), f'{p.get_height():.0f}', ha='center')
pass_type.set(xlabel='Types of Passes', ylabel='Count')
pass_type.set_title('Count of various Passes Available')
Text(0.5, 1.0, 'Count of various Passes Available')
The counts of Flex Pass (6) and Testing (27) trips are too low for their bars to be visible on the plot.
metro_bike['bike_type'].value_counts()
electric    45818
standard    28966
smart       11976
Name: bike_type, dtype: int64
Exploring count of various bike types, we see electric bikes are chosen by most users for their travel needs and we plot the same.
sns.set(rc={'figure.figsize':(11,8)})
bike_type = sns.countplot(x='bike_type', data=metro_bike)
for p in bike_type.patches:
    # Label each bar with its count, formatted as a whole number
    bike_type.text(p.get_x() + p.get_width()/2, p.get_height(), f'{p.get_height():.0f}', ha='center')
bike_type.set(xlabel='Types of Bikes', ylabel='Count')
bike_type.set_title('Count of Various Bikes Available')
Text(0.5, 1.0, 'Count of Various Bikes Available')
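A distplot combines a histogram of a single column with a kernel density estimate, a smooth curve that approximates the probability density function of the data. Note that sns.distplot is deprecated in recent Seaborn releases in favour of histplot and displot.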
pdf = sns.distplot(metro_bike['plan_duration'])
pdf.set(xlabel='Plan Duration', ylabel='Density')
pdf.set_title('Probability Density Function of Plan Duration')
plt.show()
Looking at the histogram above, we can see a curve that represents the estimated probability density function of plan_duration.
pd.to_datetime is used on the start_time and end_time columns to convert them to the datetime datatype, which displays in ISO 8601 format. Time series operations in pandas, such as the .dt accessor used below, require the datetime datatype.
metro_bike['start_time'] = pd.to_datetime(metro_bike['start_time'])
metro_bike['end_time'] = pd.to_datetime(metro_bike['end_time'])
metro_bike.head()
| trip_id | duration | start_time | end_time | start_station | start_lat | start_lon | end_station | end_lat | end_lon | bike_id | plan_duration | trip_route_category | passholder_type | bike_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 124657107 | 5 | 2019-07-01 00:04:00 | 2019-07-01 00:09:00 | 4312 | 34.066990 | -118.290878 | 4410 | 34.063351 | -118.296799 | 6168 | 30 | One Way | Monthly Pass | standard |
| 124657587 | 9 | 2019-07-01 00:07:00 | 2019-07-01 00:16:00 | 3066 | 34.063389 | -118.236160 | 3066 | 34.063389 | -118.236160 | 17584 | 30 | Round Trip | Monthly Pass | electric |
| 124658068 | 5 | 2019-07-01 00:20:00 | 2019-07-01 00:25:00 | 4410 | 34.063351 | -118.296799 | 4312 | 34.066990 | -118.290878 | 18920 | 30 | One Way | Monthly Pass | electric |
| 124659747 | 20 | 2019-07-01 00:44:00 | 2019-07-01 01:04:00 | 3045 | 34.028511 | -118.256668 | 4275 | 34.012520 | -118.285896 | 6016 | 1 | One Way | Walk-up | standard |
| 124660227 | 27 | 2019-07-01 00:44:00 | 2019-07-01 01:11:00 | 3035 | 34.048401 | -118.260948 | 3049 | 34.056969 | -118.253593 | 5867 | 30 | One Way | Monthly Pass | standard |
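The conversion above lets pandas infer the date format. Passing an explicit `format` makes the month/day order unambiguous and is usually faster on large columns. This is a small sketch, assuming the raw strings follow the month/day/year pattern seen in the first rows; the third sample value is hypothetical.

```python
import pandas as pd

# The first two strings are taken from the table above; the third is made up
raw = pd.Series(["7/1/2019 0:04", "7/1/2019 0:09", "12/31/2019 23:59"])

# Explicit format: month/day/year hour:minute
parsed = pd.to_datetime(raw, format="%m/%d/%Y %H:%M")

# Render as ISO 8601 strings to confirm the interpretation
iso = parsed.dt.strftime("%Y-%m-%dT%H:%M:%S").tolist()
print(iso)
```

With an explicit format, a string that does not match raises an error instead of being silently parsed with the day and month swapped.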
metro_bike.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 86760 entries, 124657107 to 130053088
Data columns (total 14 columns):
 #   Column               Non-Null Count  Dtype
---  ------               --------------  -----
 0   duration             86760 non-null  int64
 1   start_time           86760 non-null  datetime64[ns]
 2   end_time             86760 non-null  datetime64[ns]
 3   start_station        86760 non-null  int64
 4   start_lat            86760 non-null  float64
 5   start_lon            86760 non-null  float64
 6   end_station          86760 non-null  int64
 7   end_lat              86760 non-null  float64
 8   end_lon              86760 non-null  float64
 9   bike_id              86760 non-null  object
 10  plan_duration        86760 non-null  int64
 11  trip_route_category  86760 non-null  object
 12  passholder_type      86760 non-null  object
 13  bike_type            86760 non-null  object
dtypes: datetime64[ns](2), float64(4), int64(4), object(4)
memory usage: 11.9+ MB
To perform aggregations on duration, we must split the start_time column into separate columns.
metro_bike['start_day'] = metro_bike['start_time'].dt.dayofweek
days_of_week = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
metro_bike['start_month'] = metro_bike['start_time'].dt.month_name()
metro_bike['start_hour'] = metro_bike['start_time'].dt.hour
metro_bike['start_date'] = metro_bike['start_time'].dt.date
metro_bike['starting_time'] = metro_bike['start_time'].dt.time
metro_bike['end_hour'] = metro_bike['end_time'].dt.hour
metro_bike.head()
| trip_id | duration | start_time | end_time | start_station | start_lat | start_lon | end_station | end_lat | end_lon | bike_id | plan_duration | trip_route_category | passholder_type | bike_type | start_day | start_month | start_hour | start_date | starting_time | end_hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 124657107 | 5 | 2019-07-01 00:04:00 | 2019-07-01 00:09:00 | 4312 | 34.066990 | -118.290878 | 4410 | 34.063351 | -118.296799 | 6168 | 30 | One Way | Monthly Pass | standard | 0 | July | 0 | 2019-07-01 | 00:04:00 | 0 |
| 124657587 | 9 | 2019-07-01 00:07:00 | 2019-07-01 00:16:00 | 3066 | 34.063389 | -118.236160 | 3066 | 34.063389 | -118.236160 | 17584 | 30 | Round Trip | Monthly Pass | electric | 0 | July | 0 | 2019-07-01 | 00:07:00 | 0 |
| 124658068 | 5 | 2019-07-01 00:20:00 | 2019-07-01 00:25:00 | 4410 | 34.063351 | -118.296799 | 4312 | 34.066990 | -118.290878 | 18920 | 30 | One Way | Monthly Pass | electric | 0 | July | 0 | 2019-07-01 | 00:20:00 | 0 |
| 124659747 | 20 | 2019-07-01 00:44:00 | 2019-07-01 01:04:00 | 3045 | 34.028511 | -118.256668 | 4275 | 34.012520 | -118.285896 | 6016 | 1 | One Way | Walk-up | standard | 0 | July | 0 | 2019-07-01 | 00:44:00 | 1 |
| 124660227 | 27 | 2019-07-01 00:44:00 | 2019-07-01 01:11:00 | 3035 | 34.048401 | -118.260948 | 3049 | 34.056969 | -118.253593 | 5867 | 30 | One Way | Monthly Pass | standard | 0 | July | 0 | 2019-07-01 | 00:44:00 | 1 |
To begin with, we start by plotting the daily count of the bikes rented and we find most are rented on Tuesday with Monday coming in next.
metro_bike.groupby('start_day').count()['bike_type'].plot(figsize=(12,8))
plt.xlabel("Days of the Week")
plt.ylabel("Frequency of the Bikes rented")
plt.title("Daily Bike Frequency")
plt.show()
Similarly, we plot the bike rentals against the day of the week and the above observation holds true.
count_day = sns.countplot(x='start_day', data=metro_bike)
for p in count_day.patches:
    # Label each bar with its count, formatted as a whole number
    count_day.text(p.get_x() + p.get_width()/2, p.get_height(), f'{p.get_height():.0f}', ha='center')
count_day.set(xlabel='Day of the Week', ylabel='Count')
count_day.set_title('Count of Bikes based on Days')
Text(0.5, 1.0, 'Count of Bikes based on Days')
# Plot bike ride for each hour of the day for the entire week
plt.figure(figsize=(24,8))
# Plot the countplot with a legend to the side
sns.countplot(x='start_day', hue='start_hour', data=metro_bike, palette='terrain')
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5), ncol=2)
<matplotlib.legend.Legend at 0x7f98569b30d0>
# Create a sunburst plot using the 'start_hour' and 'start_station' columns
fig = px.sunburst(metro_bike, path=['start_hour', 'start_station'], color='start_hour')
fig.update_layout(title='Start Hour vs Start Station')
fig.show()
# Create a sunburst plot using the 'end_hour' and 'end_station' columns
fig = px.sunburst(metro_bike, path=['end_hour', 'end_station'], color='end_station')
fig.update_layout(title='End hour vs End Station')
fig.show()
We can see that trips are concentrated around start and end stations 3030 and 3014, both peaking at around the same time, the 16th and 17th hours.
We find most bikes are rented on Tuesday at the 17th hour.
We can also focus on finding how long the trip duration lasts based on the starting hour of each day of the week. This is done by using the extracted columns start_hour and start_day and plotting them against duration.
# Group the data by hour and day of the week and compute the mean duration
mean_duration = metro_bike.groupby(['start_hour', 'start_day'])['duration'].mean().reset_index()
# Loop over the days of the week
for day in range(7):
    # Select the data for the current day
    data = mean_duration[mean_duration['start_day'] == day]
    # Create a barplot
    duration_plot = sns.barplot(x='start_hour', y='duration', data=data, palette='RdYlBu')
    duration_plot.set(xlabel='Start Hour', ylabel='Duration')
    # Set the title of the plot
    duration_plot.set_title(days_of_week[day])
    # Show the plot
    plt.show()
The plots above show that the hourly pattern of mean trip duration varies by day, although it is similar for the first few days of the week. Sunday has the most even mean duration across the hours of the day.
When it comes to weekdays, the users generally take the bikes for longer durations.
We can split the analysis into two plots: one of the mean duration against start_hour and another of the mean duration against the day of the week.
mean_duration1 = metro_bike.groupby(['start_hour'])['duration'].mean().reset_index()
dur_hour = sns.lineplot(x='start_hour', y='duration', data=mean_duration1)
for line in range(0, mean_duration1.shape[0]):
    dur_hour.text(mean_duration1.start_hour[line], mean_duration1.duration[line], round(mean_duration1.duration[line], 1),
                  horizontalalignment='left', size='medium', color='grey', weight='semibold')
dur_hour.set(xlabel='Start Hour', ylabel='Mean Duration')
dur_hour.set_title('Mean Duration vs Starting Hour')
Text(0.5, 1.0, 'Mean Duration vs Starting Hour')
The line plot gives a clearer idea of the mean duration of bike trips based on the hour the trip began.
The mean trip duration peaks at almost 90 minutes when start_hour is 3, that is, for trips starting in the third hour after midnight. The shortest mean duration occurs around the 7th hour, when the day usually begins.
mean_duration2 = metro_bike.groupby(['start_day'])['duration'].mean().reset_index()
dur_day = sns.lineplot(x='start_day', y='duration', data=mean_duration2)
for line in range(0, mean_duration2.shape[0]):
    dur_day.text(mean_duration2.start_day[line], mean_duration2.duration[line], round(mean_duration2.duration[line], 1),
                 horizontalalignment='left', size='medium', color='green', weight='semibold')
dur_day.set(xlabel='Day of the Week', ylabel='Mean Duration')
dur_day.set_title('Mean Duration vs Day of the Week')
Text(0.5, 1.0, 'Mean Duration vs Day of the Week')
The above plot tells us that day 6, Sunday, has the longest mean duration. This is consistent with riders having more free time for leisure trips at the weekend. On the other hand, Friday has the shortest mean duration.
metro_bike['start_month'].value_counts()
August       30876
September    28716
July         27168
Name: start_month, dtype: int64
August had the most bikes rented and the plot confirms it.
sns.set(rc={'figure.figsize':(11,8)})
month_count = sns.countplot(x='start_month', data=metro_bike)
for p in month_count.patches:
    # Label each bar with its count, formatted as a whole number
    month_count.text(p.get_x() + p.get_width()/2, p.get_height(), f'{p.get_height():.0f}', ha='center')
month_count.set(xlabel='Month', ylabel='Count')
month_count.set_title('Count of Bikes rented based on Month')
Text(0.5, 1.0, 'Count of Bikes rented based on Month')
A pie chart of the same monthly counts gives a better idea of each month's share of trips.
# Count trips per month (largest first) and take the labels from the index,
# so each label always matches its slice
month_counts = metro_bike['start_month'].value_counts()
plt.pie(x=month_counts, autopct='%1.1f%%', labels=month_counts.index)
plt.title("Percentage of Trips per Month")
plt.show()
Count of the bikes rented plotted against the start_hour gives the result that the 17th hour of the day is the busiest for bike rentals, correlating with end of the workday.
hour_count = sns.countplot(x='start_hour', data=metro_bike)
for p in hour_count.patches:
    # Label each bar with its count, formatted as a whole number
    hour_count.text(p.get_x() + p.get_width()/2, p.get_height(), f'{p.get_height():.0f}', ha='center')
hour_count.set(xlabel='Hour', ylabel='Count')
hour_count.set_title('Count of Bikes rented based on Hour of the Day')
Text(0.5, 1.0, 'Count of Bikes rented based on Hour of the Day')
# Create a cross-tabulation of the 'passholder_type' and 'bike_type' columns
pass_bike = pd.crosstab(metro_bike['passholder_type'], metro_bike['bike_type'])
# Plot the cross-tabulation as a bar plot
ax = pass_bike.plot(kind='bar', figsize=(15, 15))
for p in ax.patches:
    # Label each bar with its count, formatted as a whole number
    ax.text(p.get_x() + p.get_width()/2, p.get_height(), f'{p.get_height():.0f}', ha='center')
plt.xlabel('Passholder Type')
plt.ylabel('Count')
plt.title('Passholder Type over various Bike Types')
Text(0.5, 1.0, 'Passholder Type over various Bike Types')
The various passholder_types are plotted against bike_types to get an idea of the kind of bikes rented by the users.
The mean trip duration is computed for each bike type in the bike_type column and plotted. We find that smart bikes have the longest mean duration; the means for the other types are roughly half that of smart bikes.
mean_duration3 = metro_bike.groupby('bike_type')['duration'].mean().reset_index()
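# `palette` already sets the bar colours, so the conflicting `color` argument is dropped
dur_bike = sns.barplot(x='bike_type', y='duration', data=mean_duration3, palette='pastel')
for p in dur_bike.patches:
    # Round the label so the mean is readable on the bar
    dur_bike.text(p.get_x() + p.get_width()/2, p.get_height(), round(p.get_height(), 1), ha='center')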
dur_bike.set(xlabel='Bike Type', ylabel='Duration')
dur_bike.set_title('Mean Duration of Various Bike Types')
Text(0.5, 1.0, 'Mean Duration of Various Bike Types')
A violinplot and swarmplot were also attempted for the above aggregation but barplot is better for comparing means of different groups and was preferred. Violinplot is useful for distribution of data and swarmplot is another form of scatterplot for individual data points.
We perform a two-sample t-test on trip duration for every pair of passholder types. The result displays the p-value for each pair, which indicates which passholder types differ significantly in mean duration.
metro_by_passholder = metro_bike.groupby("passholder_type")
duration_by_passholder = metro_by_passholder["duration"]
mean_duration_by_passholder = duration_by_passholder.mean()
# Create one subplot per passholder type (the last type has no remaining pairs)
fig, axs = plt.subplots(nrows=len(mean_duration_by_passholder.index)-1, ncols=1,
                        figsize=(20, 6*len(mean_duration_by_passholder.index)))
# Loop through each pair of passholder types and plot the p-values
for i, passholder_type1 in enumerate(mean_duration_by_passholder.index[:-1]):
    for j, passholder_type2 in enumerate(mean_duration_by_passholder.index[i+1:]):
        t_stat, p_value = stats.ttest_ind(duration_by_passholder.get_group(passholder_type1),
                                          duration_by_passholder.get_group(passholder_type2))
        print(f'{passholder_type1} vs. {passholder_type2}: p-value = {p_value}')
        axs[i].bar(f"{passholder_type1} vs. {passholder_type2}", p_value)
    axs[i].set_title(f"p-value for {passholder_type1}")
    # Label each bar once, after all bars for this subplot have been drawn
    for p in axs[i].patches:
        axs[i].text(p.get_x() + p.get_width()/2, p.get_height(), p.get_height(), ha='center')
    axs[i].set_xlabel("Passholder types")
    axs[i].set_ylabel("p-value")
plt.show()
Annual Pass vs. Flex Pass: p-value = 0.8158212864252368
Annual Pass vs. Monthly Pass: p-value = 0.0031987234897233076
Annual Pass vs. One Day Pass: p-value = 8.288939500446688e-155
Annual Pass vs. Testing: p-value = 0.2395399041543238
Annual Pass vs. Walk-up: p-value = 7.970594234589781e-87
Flex Pass vs. Monthly Pass: p-value = 0.6891909679056905
Flex Pass vs. One Day Pass: p-value = 0.30731294748527543
Flex Pass vs. Testing: p-value = 0.4103221827358773
Flex Pass vs. Walk-up: p-value = 0.46382909975284303
Monthly Pass vs. One Day Pass: p-value = 0.0
Monthly Pass vs. Testing: p-value = 0.19523283057602708
Monthly Pass vs. Walk-up: p-value = 0.0
One Day Pass vs. Testing: p-value = 0.11446433041037984
One Day Pass vs. Walk-up: p-value = 3.814834116252384e-19
Testing vs. Walk-up: p-value = 0.3668698793884352
The code above runs an independent two-sample t-test on trip duration for every pair of passholder types. For each pair it computes the t-statistic and the p-value, which is used to judge whether the mean durations differ significantly. The null hypothesis is that the two passholder types have the same mean duration; if the p-value is less than 0.05, the null hypothesis is rejected and the difference is considered significant.
Looking at the graphs and p-values above, the Annual Pass differs significantly from the Monthly Pass, One Day Pass and Walk-up, so the null hypothesis is rejected for those pairs. It does not differ significantly from the Flex Pass or Testing, which have very few trips (6 and 27).
We can also reject the null hypothesis for One Day Pass vs Walk-up, as the p-value is less than 0.05.
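Two caveats apply to pairwise tests like these: `stats.ttest_ind` assumes equal variances by default, and running many tests at once inflates the chance of a false positive. A possible refinement, sketched below on synthetic data, is Welch's t-test (`equal_var=False`) combined with a Bonferroni-corrected threshold. The group names `A`, `B`, `C` and the simulated durations are hypothetical and this is not the method used in the notebook. With the 15 pairs tested above, the corrected threshold would be 0.05 / 15, roughly 0.0033.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for (passholder_type, duration); values are made up
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "passholder_type": ["A"] * 200 + ["B"] * 200 + ["C"] * 200,
    "duration": np.concatenate([
        rng.normal(10, 2, 200),   # group A
        rng.normal(10, 2, 200),   # group B: same distribution as A
        rng.normal(30, 10, 200),  # group C: longer and more variable trips
    ]),
})

groups = {name: g["duration"].to_numpy() for name, g in toy.groupby("passholder_type")}
pairs = list(combinations(groups, 2))

# Bonferroni correction: divide the significance level by the number of tests
alpha = 0.05 / len(pairs)

results = {}
for a, b in pairs:
    # Welch's t-test does not assume the two groups share a variance
    t_stat, p_value = stats.ttest_ind(groups[a], groups[b], equal_var=False)
    results[(a, b)] = (p_value, bool(p_value < alpha))
    print(f"{a} vs. {b}: p-value = {p_value:.3g}, significant = {p_value < alpha}")
```

The correction is conservative, so borderline pairs may stop being significant once it is applied.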
The seeds dataset has been sourced from the UCI Machine Learning repository. (Source: https://archive.ics.uci.edu/ml/datasets/seeds)
The dataset contains measurements of wheat kernels from three different varieties, where each row holds the measurements of a single seed.
| Column | Description |
|---|---|
| area | A, the area of the seed. |
| perimeter | P, the length of the perimeter of the seed. |
| compactness | A measure of the area of the seed relative to the perimeter, 4πA/P². |
| length | The length of the seed. |
| width | The width of the seed. |
| asymmetry | A measure of the asymmetry of the seed. |
| groove_length | The length of the groove in the seed. |
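As a quick sanity check on the compactness definition in the table, the formula 4πA/P² can be evaluated by hand. The sketch below uses a hypothetical helper named `compactness` and the area and perimeter of the first row of the dataset (15.26 and 14.84).

```python
import math


def compactness(area: float, perimeter: float) -> float:
    """Return 4*pi*A / P**2, which equals 1 for a perfect circle."""
    return 4 * math.pi * area / perimeter ** 2


# Area and perimeter of the first seed in the dataset
value = compactness(15.26, 14.84)
print(round(value, 3))  # 0.871

# A circle of radius 1 has area pi and perimeter 2*pi
print(compactness(math.pi, 2 * math.pi))
```

The result agrees with the compactness of 0.871 listed for that row, and values closer to 1 indicate a rounder seed.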
# Load the dataset `seeds.csv` into a dataframe `seeds`
seeds = pd.read_csv('seeds.csv')
seeds.head()
| | area | perimeter | compactness | length | width | asymmetry | groove_length |
|---|---|---|---|---|---|---|---|
| 0 | 15.26 | 14.84 | 0.871 | 5.763 | 3.312 | 2.221 | 5.220 |
| 1 | 14.88 | 14.57 | 0.881 | 5.554 | 3.333 | 1.018 | 4.956 |
| 2 | 14.29 | 14.09 | 0.905 | 5.291 | 3.337 | 2.699 | 4.825 |
| 3 | 13.84 | 13.94 | 0.895 | 5.324 | 3.379 | 2.259 | 4.805 |
| 4 | 16.14 | 14.99 | 0.903 | 5.658 | 3.562 | 1.355 | 5.175 |
The next few cells explore understanding datatypes, statistics and null/missing values, along with creating visualisations to understand the data.
seeds.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 210 entries, 0 to 209
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   area           210 non-null    float64
 1   perimeter      210 non-null    float64
 2   compactness    210 non-null    float64
 3   length         210 non-null    float64
 4   width          210 non-null    float64
 5   asymmetry      210 non-null    float64
 6   groove_length  210 non-null    float64
dtypes: float64(7)
memory usage: 11.6 KB
seeds.describe()
| | area | perimeter | compactness | length | width | asymmetry | groove_length |
|---|---|---|---|---|---|---|---|
| count | 210.000000 | 210.000000 | 210.000000 | 210.000000 | 210.000000 | 210.000000 | 210.000000 |
| mean | 14.847524 | 14.559286 | 0.871000 | 5.628533 | 3.258605 | 3.700200 | 5.408071 |
| std | 2.909699 | 1.305959 | 0.023594 | 0.443063 | 0.377714 | 1.503559 | 0.491480 |
| min | 10.590000 | 12.410000 | 0.808000 | 4.899000 | 2.630000 | 0.765000 | 4.519000 |
| 25% | 12.270000 | 13.450000 | 0.857250 | 5.262250 | 2.944000 | 2.561500 | 5.045000 |
| 50% | 14.355000 | 14.320000 | 0.873500 | 5.523500 | 3.237000 | 3.599000 | 5.223000 |
| 75% | 17.305000 | 15.715000 | 0.887750 | 5.979750 | 3.561750 | 4.768750 | 5.877000 |
| max | 21.180000 | 17.250000 | 0.918000 | 6.675000 | 4.033000 | 8.456000 | 6.550000 |
seeds.isna().sum()
area             0
perimeter        0
compactness      0
length           0
width            0
asymmetry        0
groove_length    0
dtype: int64
seeds.duplicated().sum()
0
A pairplot creates a matrix of scatterplots that shows the relationship between every pair of variables in the dataset.
pair_plot = sns.pairplot(seeds, hue='perimeter')
pair_plot.fig.suptitle('Plot of Seeds Data', y=1.03)
Text(0.5, 1.03, 'Plot of Seeds Data')
A correlation matrix is utilised when plotting a heatmap to help understand the strength of the relationship between two variables.
corr_matrix = seeds.corr()
sns.set(rc={'figure.figsize':(12,10)})
sns.heatmap(corr_matrix, vmax=1, square=True, cmap='RdYlGn')
<AxesSubplot:>
seeds.shape
(210, 7)
Clustering algorithms tend to perform better on datasets with fewer dimensions, so we apply PCA to reduce the seven features to two principal components.
# Perform PCA to find two principal components before clustering for improving performance of the algorithms
pca = PCA(n_components=2)
seeds_pca = pca.fit_transform(seeds)
seeds_pca_df = pd.DataFrame(seeds_pca, columns=['col1', 'col2'])
pca_plot = sns.pairplot(seeds_pca_df);
pca_plot.fig.suptitle('Plot of Seeds PCA Data', y=1.03)
Text(0.5, 1.03, 'Plot of Seeds PCA Data')
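Keeping two components only helps if they retain most of the variance. One way to check this is the `explained_variance_ratio_` attribute of a fitted `PCA` object. The sketch below uses synthetic data as a stand-in for the seeds table (an assumption), and all `demo_*` names are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in (assumption): 210 rows, 7 features driven by 2 latent factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(210, 2))
mixing = rng.normal(size=(2, 7))
demo_data = latent @ mixing + 0.1 * rng.normal(size=(210, 7))

demo_pca = PCA(n_components=2)
demo_components = demo_pca.fit_transform(demo_data)

# Fraction of the total variance captured by each retained component
explained = demo_pca.explained_variance_ratio_
print(explained, explained.sum())
```

If the summed ratio is low on the real data, more components (or scaling the features first) may be worth considering.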
K-means clustering is an unsupervised learning method that partitions a dataset into a specified number of clusters, k.
To perform k-means clustering, the user specifies the number of clusters, and the algorithm starts from an initial set of k centroids (scikit-learn chooses these automatically, using k-means++ by default). The algorithm then alternates between assigning each data point to the cluster with the closest centroid and recomputing each centroid as the mean of its assigned points, until the assignments stop changing.
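To make the assign-and-update loop concrete, here is a minimal NumPy sketch of the iteration. It is illustrative only and is not how scikit-learn's `KMeans` is implemented (for instance, it uses plain random initialisation rather than k-means++). The function name `lloyd_kmeans` and the demo data are hypothetical.

```python
import numpy as np

def lloyd_kmeans(points, k, n_iter=100, seed=0):
    """Minimal k-means loop for illustration (random initialisation)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids by picking k distinct data points at random
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each point with the index of its nearest centroid
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid)
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two well-separated synthetic blobs (assumed demo data)
rng = np.random.default_rng(1)
demo_points = np.vstack([rng.normal(0, 0.5, size=(50, 2)),
                         rng.normal(5, 0.5, size=(50, 2))])
demo_labels, demo_centroids = lloyd_kmeans(demo_points, k=2)
print(demo_centroids)
```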
# Create an empty list to store the silhouette scores
silhouette_scores = []
# Loop through the number of clusters to use for k-means
for n in range(2, 8):
    # Perform k-means clustering on the two PCA columns only, so that label
    # columns added in earlier iterations are not used as features
    kmeans = KMeans(n_clusters=n)
    seeds_pca_df[f'{n}_clusters'] = kmeans.fit_predict(seeds_pca_df[['col1', 'col2']])
    # Calculate the silhouette score for the current number of clusters
    score = silhouette_score(seeds_pca_df[['col1', 'col2']], seeds_pca_df[f'{n}_clusters'])
    # Append the silhouette score to the list
    silhouette_scores.append(score)
# Create a figure with 3 rows and 2 columns of subplots
fig, ax = plt.subplots(3, 2, sharex=True, sharey=True, figsize=(10, 8))
# Loop through each row and column in the figure
for row in range(3):
    for col in range(2):
        # Map the subplot position to a cluster count (2 to 7)
        n_clusters = 3*col + row + 2
        # Set the color palette for the current subplot
        color_map = plt.get_cmap('inferno', n_clusters)
        # Create a scatterplot on the current subplot using the 'col1' and 'col2' columns as the x and y values, respectively
        ax[row, col].scatter(seeds_pca_df['col1'], seeds_pca_df['col2'],
                             c=seeds_pca_df[f'{n_clusters}_clusters'], cmap=color_map)
        # Set the title of the current subplot
        ax[row, col].set_title(f"{n_clusters} Clusters\nSilhouette Score = {silhouette_scores[n_clusters-2]:.3f}")
        # Add a grid to the current subplot
        ax[row, col].grid(True)
# Add a title to the figure
plt.suptitle("KMeans Clustering")
# Tighten the layout of the figure
fig.tight_layout()
The idea behind the elbow method is that, as the number of clusters increases, the within-cluster sum of squares (WCSS) initially decreases rapidly as the clusters become more compact. Beyond a certain point it decreases more slowly, as the benefit of adding further clusters diminishes. The elbow point is the value of k at which this transition occurs. For our dataset, the elbow is found at k = 3.
# Arbitrarily selecting a range of values for K to perform the elbow method
K = range(1,11)
sum_of_squared_distances = []
# Using scikit-learn's KMeans algorithm to find the sum of squared distances (WCSS)
for k in K:
    # Fit on the two PCA columns only, not on any cluster label columns
    model = KMeans(n_clusters=k).fit(seeds_pca_df[['col1', 'col2']])
    sum_of_squared_distances.append(model.inertia_)
plt.plot(K, sum_of_squared_distances)
plt.xlabel('K values')
plt.ylabel('Sum of Squared Distances')
plt.title('Elbow Method')
plt.show()
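Reading the elbow off a plot is somewhat subjective. As a rough cross-check, one heuristic (an assumption on our part, not a standard scikit-learn feature) is to pick the k at which the second difference of the WCSS curve is largest, that is, where the slope flattens most abruptly. The sketch below uses synthetic blobs from `make_blobs` rather than the seeds data, and all names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with three well-separated groups (assumption, not the seeds data)
demo_X, _ = make_blobs(n_samples=210, centers=[[0, 0], [10, 0], [5, 8.66]],
                       cluster_std=0.8, random_state=0)

k_values = list(range(1, 11))
wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(demo_X).inertia_
        for k in k_values]

# Second difference of the WCSS curve: large where the decrease slows abruptly
second_diff = np.diff(wcss, n=2)
# second_diff[i] is centred on k_values[i + 1]
elbow_k = k_values[int(np.argmax(second_diff)) + 1]
print(elbow_k)
```

On curves without a sharp bend this heuristic can be unstable, so it is best used alongside the plot and the silhouette scores rather than instead of them.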
A Gaussian mixture model (GMM) is a probabilistic model that represents the distribution of a dataset as a mixture of several Gaussian distributions.
To fit a GMM, we specify the number of components in the mixture, and the model is initialised with a set of parameters (weights, means and covariances). These parameters are then refined by an iterative optimisation procedure (expectation-maximisation in scikit-learn) until convergence, at which point the model is considered trained.
# Initialise a list to store the silhouette scores
silhouette_scores = []
for p in range(2, 8):
    # Perform GMM clustering on the two PCA columns only, so that label
    # columns already in the dataframe are not used as features
    gmm = GMM(n_components=p)
    seeds_pca_df[f'{p}_clusters'] = gmm.fit_predict(seeds_pca_df[['col1', 'col2']])
    # Compute the silhouette score for the current clustering model
    silhouette_scores.append(silhouette_score(seeds_pca_df[['col1', 'col2']], seeds_pca_df[f'{p}_clusters']))
# Create a figure with 3 rows and 2 columns of subplots
fig, ax = plt.subplots(3, 2, sharex=True, sharey=True, figsize=(10, 8))
# Loop through each row and column in the figure
for row in range(3):
    for col in range(2):
        # Map the subplot position to a cluster count (2 to 7)
        n_clusters = 3*col + row + 2
        # Set the color palette for the current subplot
        color_map = plt.get_cmap('magma', n_clusters)
        # Create a scatterplot on the current subplot using the 'col1' and 'col2' columns as the x and y values, respectively
        ax[row, col].scatter(seeds_pca_df['col1'], seeds_pca_df['col2'],
                             c=seeds_pca_df[f'{n_clusters}_clusters'], cmap=color_map)
        # Set the title of the current subplot
        ax[row, col].set_title(f"{n_clusters} Clusters\nSilhouette Score = {silhouette_scores[n_clusters-2]:.3f}")
        # Add a grid to the current subplot
        ax[row, col].grid(True)
# Add a title to the figure
plt.suptitle("GMM Clustering")
# Tighten the layout of the figure
fig.tight_layout()
DBSCAN is a density-based algorithm: it groups points that lie in dense regions into clusters and marks points in sparse regions as noise.
DBSCAN takes two parameters, eps and min_samples. eps is the maximum distance between two points for one to be considered in the neighbourhood of the other. min_samples is the number of points (including the point itself) that must lie within eps of a point for it to count as a core point of a cluster.
With the settings tried below, DBSCAN does not separate the data into meaningful clusters, so we do not pursue it further.
eps = 1.0
min_samples = 10
# Initialise and fit DBSCAN on the two PCA columns only
db = DBSCAN(eps=eps, min_samples=min_samples).fit(seeds_pca_df[['col1', 'col2']])
labels = db.labels_
# Count the clusters found, excluding the noise label (-1)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
plt.scatter(seeds_pca_df['col1'], seeds_pca_df['col2'], s=15, c=labels, cmap='jet')
plt.title(f'DBSCAN, eps={eps}, min_samples={min_samples}, n_clusters={n_clusters}')
plt.gca().set_aspect('equal')
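DBSCAN results depend heavily on eps. A common heuristic for choosing it is the k-distance curve: compute each point's distance to its min_samples-th nearest neighbour, sort these distances, and look for the value where they start to rise sharply. The sketch below uses synthetic blobs, and taking the 90th percentile as a stand-in for "the knee" is a crude assumption on our part. All names are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors

# Synthetic data with three dense groups (assumption, not the seeds data)
demo_X, _ = make_blobs(n_samples=210, centers=[[0, 0], [10, 0], [5, 8.66]],
                       cluster_std=0.8, random_state=0)

min_samples_demo = 10
# Distance from every point to its min_samples-th nearest neighbour; the query
# point itself counts as the first neighbour, as in DBSCAN's min_samples
nn = NearestNeighbors(n_neighbors=min_samples_demo).fit(demo_X)
distances, _ = nn.kneighbors(demo_X)
k_distances = np.sort(distances[:, -1])

# Crude candidate for eps: a high percentile of the sorted k-distances
eps_demo = float(np.percentile(k_distances, 90))

demo_labels = DBSCAN(eps=eps_demo, min_samples=min_samples_demo).fit_predict(demo_X)
n_found = len(set(demo_labels)) - (1 if -1 in demo_labels else 0)
n_noise = int(np.sum(demo_labels == -1))
print(eps_demo, n_found, n_noise)
```

Plotting `k_distances` and picking the bend by eye is the more usual way to use this curve; the percentile is only a shortcut for the sketch.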
Looking at the result of PCA on the seeds dataset, we can see that the points are spread fairly evenly over the two dimensions.
Since this is a small dataset, k-means is a fast and efficient way to cluster it. Because k-means repeatedly refines its centroids until convergence, it yields compact, well-defined clusters on this data.
GMM is attractive because it is a probabilistic model of the dataset: for each point it provides the probability of belonging to each cluster. For this dataset, however, some of the GMM clusters are not well formed, and DBSCAN did not cluster the data meaningfully, so we prefer k-means over the other two methods.
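The per-point probabilities mentioned above are available from a fitted scikit-learn `GaussianMixture` through `predict_proba`. The sketch below illustrates this on two overlapping synthetic groups; the data and all names are assumptions, not the seeds PCA output.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Two overlapping synthetic groups (assumed demo data)
rng = np.random.default_rng(0)
demo_X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
                    rng.normal(3.0, 1.0, size=(100, 2))])

demo_gmm = GaussianMixture(n_components=2, random_state=0).fit(demo_X)

# Hard labels: the most probable component for each point
hard_labels = demo_gmm.predict(demo_X)
# Soft assignments: one probability per component, each row summing to 1
soft_probs = demo_gmm.predict_proba(demo_X)

# Points whose highest probability is below 0.9 sit between the two components
ambiguous = int(np.sum(soft_probs.max(axis=1) < 0.9))
print(soft_probs[:3].round(3), ambiguous)
```

K-means has no direct equivalent of these soft assignments, which is the main trade-off in preferring it here.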
The social network dataset is taken from the Koblenz Network Collection (Source: http://konect.cc/).
It contains anonymised social network data in which users are nodes (numbered 1 to 2888) and the edges between them are undirected.
Each row is an edge between two nodes of the network.
# No column names in the dataset, so we add them
colnames=['nodes', 'edges']
# Load the dataset `social-network.csv` into `soc_net`
soc_net = pd.read_csv('social-network.csv', names=colnames, header=None)
soc_net.head()
| | nodes | edges |
|---|---|---|
| 0 | 1 | 2 |
| 1 | 1 | 3 |
| 2 | 1 | 4 |
| 3 | 1 | 5 |
| 4 | 1 | 6 |
We look at summary statistics, such as the largest node ID in each column, and check whether there are any missing/null values to handle.
soc_net.describe()
| | nodes | edges |
|---|---|---|
| count | 2981.000000 | 2981.000000 |
| mean | 970.580342 | 1458.960751 |
| std | 776.901860 | 839.483888 |
| min | 1.000000 | 2.000000 |
| 25% | 288.000000 | 720.000000 |
| 50% | 603.000000 | 1460.000000 |
| 75% | 1525.000000 | 2202.000000 |
| max | 2699.000000 | 2888.000000 |
soc_net.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2981 entries, 0 to 2980
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   nodes   2981 non-null   int64
 1   edges   2981 non-null   int64
dtypes: int64(2)
memory usage: 46.7 KB
NetworkX gives us the option to quickly create a graph directly from a pandas dataframe.
# `nx.from_pandas_edgelist()` takes the dataframe plus the names of the source and target columns (here `nodes` and `edges`).
net_graph = nx.from_pandas_edgelist(soc_net, source='nodes', target='edges')
net_graph
<networkx.classes.graph.Graph at 0x7f9862080430>
net_graph.number_of_nodes()
2888
net_graph.number_of_edges()
2981
With 2981 edges for 2888 nodes (an average degree of about 2.06), some nodes must have multiple edges, so it will be interesting to explore the visualisation.
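The relationship between edge count, node count and degree can be checked directly with NetworkX. The sketch below uses a tiny hypothetical edge list with the same two-column layout as `soc_net`; the names `demo_edges`, `demo_graph` and `hubs` are illustrative only.

```python
import networkx as nx
import pandas as pd

# Tiny hypothetical edge list: each row holds the two endpoints of one edge
demo_edges = pd.DataFrame({'nodes': [1, 1, 1, 2, 4], 'edges': [2, 3, 4, 3, 5]})
demo_graph = nx.from_pandas_edgelist(demo_edges, source='nodes', target='edges')

# Degree of every node, then the average degree (2 * edges / nodes)
degrees = dict(demo_graph.degree())
avg_degree = 2 * demo_graph.number_of_edges() / demo_graph.number_of_nodes()

# Nodes sorted from most to least connected
hubs = sorted(degrees.items(), key=lambda item: item[1], reverse=True)
print(avg_degree, hubs[:3])
```

Applied to `net_graph`, the same `hubs` list would show which users account for most of the connections.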
net_graph.nodes()
NodeView((1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 1525, 603, 710, 714, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 
416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 711, 712, 713, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 2232, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 
817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1170, 1171, 1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 
1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193, 1194, 1195, 1196, 1197, 1198, 1199, 1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1213, 1214, 1215, 1216, 1217, 1218, 1219, 1220, 1221, 1222, 1223, 1224, 1225, 1226, 1227, 1228, 1229, 1230, 1231, 1232, 1233, 1234, 1235, 1236, 1237, 1238, 1239, 1240, 1241, 1242, 1243, 1244, 1245, 1246, 1247, 1248, 1249, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258, 1259, 1260, 1261, 1262, 1263, 1264, 1265, 1266, 1267, 1268, 1269, 1270, 1271, 1272, 1273, 1274, 1275, 1276, 1277, 1278, 1279, 1280, 1281, 1282, 1283, 1284, 1285, 1286, 1287, 1288, 1289, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1298, 1299, 1300, 1301, 1302, 1303, 1304, 1305, 1306, 1307, 1308, 1309, 1310, 1311, 1312, 1313, 1314, 1315, 1316, 1317, 1318, 1319, 1320, 1321, 1322, 1323, 1324, 1325, 1326, 1327, 1328, 1329, 1330, 1331, 1332, 1333, 1334, 1335, 1336, 1337, 1338, 1339, 1340, 1341, 1342, 1343, 1344, 1345, 1346, 1347, 1348, 1349, 1350, 1351, 1352, 1353, 1354, 1355, 1356, 1357, 1358, 1359, 1360, 1361, 1362, 1363, 1364, 1365, 1366, 1367, 1368, 1369, 1370, 1371, 1372, 1373, 1374, 1375, 1376, 1377, 1378, 1379, 1380, 1381, 1382, 1383, 1384, 1385, 1386, 1387, 1388, 1389, 1390, 1391, 1392, 1393, 1394, 1395, 1396, 1397, 1398, 1399, 1400, 1401, 1402, 1403, 1404, 1405, 1406, 1407, 1408, 1409, 1410, 1411, 1412, 1413, 1414, 1415, 1416, 1417, 1418, 1419, 1420, 1421, 1422, 1423, 1424, 1425, 1426, 1427, 1428, 1429, 1430, 1431, 1432, 1433, 1434, 1435, 1436, 1437, 1438, 1439, 1440, 1441, 1442, 1443, 1444, 1445, 1446, 1447, 1448, 1449, 1450, 1451, 1452, 1453, 1454, 1455, 1456, 1457, 1458, 1459, 1460, 1461, 1462, 1463, 1464, 1465, 1466, 1467, 1468, 1469, 1470, 1471, 1472, 1473, 1474, 1475, 1476, 1477, 1478, 1479, 1480, 1481, 1482, 1483, 1484, 1485, 1486, 1487, 1488, 1489, 1490, 1491, 1492, 1493, 1494, 1495, 1496, 1497, 1498, 1499, 1500, 1501, 1502, 1503, 1504, 1505, 1506, 1507, 1508, 1509, 1510, 1511, 1512, 
1513, 1514, 1515, 1516, 1517, 1518, 1519, 1520, 1521, 1522, 1523, 1524, 2329, 2330, 2331, 2332, 2333, 2334, 2335, 2336, 2337, 2338, 2339, 2340, 2341, 2342, 2343, 2344, 2345, 2346, 2347, 2348, 2349, 2350, 2351, 2352, 2353, 2354, 2355, 2356, 2357, 2358, 2359, 2360, 2361, 2362, 2363, 2364, 2365, 2366, 2367, 2368, 2369, 2370, 2371, 2372, 2373, 2374, 2375, 2376, 2377, 2378, 2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2433, 2434, 2435, 2436, 2437, 2438, 2439, 2440, 2441, 2442, 2443, 2444, 2445, 2446, 2447, 2448, 2449, 2450, 2451, 2452, 2453, 2454, 2455, 2456, 2457, 2458, 2459, 2460, 2461, 2462, 2463, 2464, 2465, 2466, 2467, 2468, 2469, 2470, 2471, 2472, 2473, 2474, 2475, 2476, 2477, 2478, 2479, 2480, 2481, 2482, 2483, 2484, 2485, 2486, 2487, 2488, 2489, 2490, 2491, 2492, 2493, 2494, 2495, 2496, 2497, 2498, 2499, 2500, 2501, 2502, 2503, 2504, 2505, 2506, 2507, 2508, 2509, 2510, 2511, 2512, 2513, 2514, 2515, 2516, 2517, 2518, 2519, 2520, 2521, 2522, 2523, 2524, 2525, 2526, 2527, 2528, 2529, 2530, 2531, 2532, 2533, 2534, 2535, 2594, 2595, 2596, 2597, 2598, 2599, 2600, 2601, 2602, 2603, 2604, 2605, 2606, 2607, 2608, 2609, 2610, 2611, 2612, 2613, 2614, 2615, 2616, 2617, 2618, 2619, 2620, 2621, 2622, 2623, 2624, 2625, 2626, 2627, 2628, 2629, 2630, 2631, 2632, 2633, 2634, 2635, 2636, 2637, 2638, 2639, 2640, 2641, 2642, 2643, 2644, 2645, 2646, 2647, 2648, 2649, 2650, 2651, 2652, 2653, 2654, 2655, 2656, 2657, 2658, 2659, 2660, 2661, 2662, 2663, 2664, 2665, 2666, 2667, 2668, 2669, 2670, 2671, 2672, 2673, 2674, 2675, 2676, 2677, 2678, 2679, 2680, 2681, 2682, 2683, 2684, 2685, 2686, 2699, 1526, 1527, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535, 1536, 1537, 1538, 1539, 1540, 1541, 1542, 1543, 1544, 1545, 
1546, 1547, 1548, 1549, 1550, 1551, 1552, 1553, 1554, 1555, 1556, 1557, 1558, 1559, 1560, 1561, 1562, 1563, 1564, 1565, 1566, 1567, 1568, 1569, 1570, 1571, 1572, 1573, 1574, 1575, 1576, 1577, 1578, 1579, 1580, 1581, 1582, 1583, 1584, 1585, 1586, 1587, 1588, 1589, 1590, 1591, 1592, 1593, 1594, 1595, 1596, 1597, 1598, 1599, 1600, 1601, 1602, 1603, 1604, 1605, 1606, 1607, 1608, 1609, 1610, 1611, 1612, 1613, 1614, 1615, 1616, 1617, 1618, 1619, 1620, 1621, 1622, 1623, 1624, 1625, 1626, 1627, 1628, 1629, 1630, 1631, 1632, 1633, 1634, 1635, 1636, 1637, 1638, 1639, 1640, 1641, 1642, 1643, 1644, 1645, 1646, 1647, 1648, 1649, 1650, 1651, 1652, 1653, 1654, 1655, 1656, 1657, 1658, 1659, 1660, 1661, 1662, 1663, 1664, 1665, 1666, 1667, 1668, 1669, 1670, 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, 1681, 1682, 1683, 1684, 1685, 1686, 1687, 1688, 1689, 1690, 1691, 1692, 1693, 1694, 1695, 1696, 1697, 1698, 1699, 1700, 1701, 1702, 1703, 1704, 1705, 1706, 1707, 1708, 1709, 1710, 1711, 1712, 1713, 1714, 1715, 1716, 1717, 1718, 1719, 1720, 1721, 1722, 1723, 1724, 1725, 1726, 1727, 1728, 1729, 1730, 1731, 1732, 1733, 1734, 1735, 1736, 1737, 1738, 1739, 1740, 1741, 1742, 1743, 1744, 1745, 1746, 1747, 1748, 1749, 1750, 1751, 1752, 1753, 1754, 1755, 1756, 1757, 1758, 1759, 1760, 1761, 1762, 1763, 1764, 1765, 1766, 1767, 1768, 1769, 1770, 1771, 1772, 1773, 1774, 1775, 1776, 1777, 1778, 1779, 1780, 1781, 1782, 1783, 1784, 1785, 1786, 1787, 1788, 1789, 1790, 1791, 1792, 1793, 1794, 1795, 1796, 1797, 1798, 1799, 1800, 1801, 1802, 1803, 1804, 1805, 1806, 1807, 1808, 1809, 1810, 1811, 1812, 1813, 1814, 1815, 1816, 1817, 1818, 1819, 1820, 1821, 1822, 1823, 1824, 1825, 1826, 1827, 1828, 1829, 1830, 1831, 1832, 1833, 1834, 1835, 1836, 1837, 1838, 1839, 1840, 1841, 1842, 1843, 1844, 1845, 1846, 1847, 1848, 1849, 1850, 1851, 1852, 1853, 1854, 1855, 1856, 1857, 1858, 1859, 1860, 1861, 1862, 1863, 1864, 1865, 1866, 1867, 1868, 1869, 1870, 1871, 1872, 1873, 1874, 1875, 1876, 1877, 1878, 
1879, 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1890, 1891, 1892, 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920, 1921, 1922, 1923, 1924, 1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934, 1935, 1936, 1937, 1938, 1939, 1940, 1941, 1942, 1943, 1944, 1945, 1946, 1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025, 2026, 2027, 2028, 2029, 2030, 2031, 2032, 2033, 2034, 2035, 2036, 2037, 2038, 2039, 2040, 2041, 2042, 2043, 2044, 2045, 2046, 2047, 2048, 2049, 2050, 2051, 2052, 2053, 2054, 2055, 2056, 2057, 2058, 2059, 2060, 2061, 2062, 2063, 2064, 2065, 2066, 2067, 2068, 2069, 2070, 2071, 2072, 2073, 2074, 2075, 2076, 2077, 2078, 2079, 2080, 2081, 2082, 2083, 2084, 2085, 2086, 2087, 2088, 2089, 2090, 2091, 2092, 2093, 2094, 2095, 2096, 2097, 2098, 2099, 2100, 2101, 2102, 2103, 2104, 2105, 2106, 2107, 2108, 2109, 2110, 2111, 2112, 2113, 2114, 2115, 2116, 2117, 2118, 2119, 2120, 2121, 2122, 2123, 2124, 2125, 2126, 2127, 2128, 2129, 2130, 2131, 2132, 2133, 2134, 2135, 2136, 2137, 2138, 2139, 2140, 2141, 2142, 2143, 2144, 2145, 2146, 2147, 2148, 2149, 2150, 2151, 2152, 2153, 2154, 2155, 2156, 2157, 2158, 2159, 2160, 2161, 2162, 2163, 2164, 2165, 2166, 2167, 2168, 2169, 2170, 2171, 2172, 2173, 2174, 2175, 2176, 2177, 2178, 2179, 2180, 2181, 2182, 2183, 2184, 2185, 2186, 2187, 2188, 2189, 2190, 2191, 2192, 2193, 2194, 2195, 2196, 2197, 2198, 2199, 2200, 2201, 2202, 2203, 2204, 2205, 2206, 2207, 2208, 2209, 2210, 2211, 
2212, 2213, 2214, 2215, 2216, 2217, 2218, 2219, 2220, 2221, 2222, 2223, 2224, 2225, 2226, 2227, 2228, 2229, 2230, 2231, 2233, 2234, 2235, 2236, 2237, 2238, 2239, 2240, 2241, 2242, 2243, 2244, 2245, 2246, 2247, 2248, 2249, 2250, 2251, 2252, 2253, 2254, 2255, 2256, 2257, 2258, 2259, 2260, 2261, 2262, 2263, 2264, 2265, 2266, 2267, 2268, 2269, 2270, 2271, 2272, 2273, 2274, 2275, 2276, 2277, 2278, 2279, 2280, 2281, 2282, 2283, 2284, 2285, 2286, 2287, 2288, 2289, 2290, 2291, 2292, 2293, 2294, 2295, 2296, 2297, 2298, 2299, 2300, 2301, 2302, 2303, 2304, 2305, 2306, 2307, 2308, 2309, 2310, 2311, 2312, 2313, 2314, 2315, 2316, 2317, 2318, 2319, 2320, 2321, 2322, 2323, 2324, 2325, 2326, 2327, 2328, 2536, 2537, 2538, 2539, 2540, 2541, 2542, 2543, 2544, 2545, 2546, 2547, 2548, 2549, 2550, 2551, 2552, 2553, 2554, 2555, 2556, 2557, 2558, 2559, 2560, 2561, 2562, 2563, 2564, 2565, 2566, 2567, 2568, 2569, 2570, 2571, 2572, 2573, 2574, 2575, 2576, 2577, 2578, 2579, 2580, 2581, 2582, 2583, 2584, 2585, 2586, 2587, 2588, 2589, 2590, 2591, 2592, 2593, 2687, 2688, 2689, 2690, 2691, 2692, 2693, 2694, 2695, 2696, 2697, 2698, 2700, 2701, 2702, 2703, 2704, 2705, 2706, 2707, 2708, 2709, 2710, 2711, 2712, 2713, 2714, 2715, 2716, 2717, 2718, 2719, 2720, 2721, 2722, 2723, 2724, 2725, 2726, 2727, 2728, 2729, 2730, 2731, 2732, 2733, 2734, 2735, 2736, 2737, 2738, 2739, 2740, 2741, 2742, 2743, 2744, 2745, 2746, 2747, 2748, 2749, 2750, 2751, 2752, 2753, 2754, 2755, 2756, 2757, 2758, 2759, 2760, 2761, 2762, 2763, 2764, 2765, 2766, 2767, 2768, 2769, 2770, 2771, 2772, 2773, 2774, 2775, 2776, 2777, 2778, 2779, 2780, 2781, 2782, 2783, 2784, 2785, 2786, 2787, 2788, 2789, 2790, 2791, 2792, 2793, 2794, 2795, 2796, 2797, 2798, 2799, 2800, 2801, 2802, 2803, 2804, 2805, 2806, 2807, 2808, 2809, 2810, 2811, 2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 2820, 2821, 2822, 2823, 2824, 2825, 2826, 2827, 2828, 2829, 2830, 2831, 2832, 2833, 2834, 2835, 2836, 2837, 2838, 2839, 2840, 2841, 2842, 2843, 2844, 2845, 2846, 
2847, 2848, 2849, 2850, 2851, 2852, 2853, 2854, 2855, 2856, 2857, 2858, 2859, 2860, 2861, 2862, 2863, 2864, 2865, 2866, 2867, 2868, 2869, 2870, 2871, 2872, 2873, 2874, 2875, 2876, 2877, 2878, 2879, 2880, 2881, 2882, 2883, 2884, 2885, 2886, 2887, 2888))
net_graph.edges()
EdgeView([(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11), (1, 12), (1, 13), (1, 14), (1, 15), (1, 16), (1, 17), (1, 18), (1, 19), (1, 20), (1, 21), (1, 22), (1, 23), (1, 24), (1, 25), (1, 26), (1, 27), (1, 28), (1, 29), (1, 30), (1, 31), (1, 32), (1, 33), (1, 34), (1, 35), (1, 36), (1, 37), (1, 38), (1, 39), (1, 40), (1, 41), (1, 42), (1, 43), (1, 44), (1, 45), (1, 46), (1, 47), (1, 48), (1, 49), (1, 50), (1, 51), (1, 52), (1, 53), (1, 54), (1, 55), (1, 56), (1, 57), (1, 58), (1, 59), (1, 60), (1, 61), (1, 62), (1, 63), (1, 64), (1, 65), (1, 66), (1, 67), (1, 68), (1, 69), (1, 70), (1, 71), (1, 72), (1, 73), (1, 74), (1, 75), (1, 76), (1, 77), (1, 78), (1, 79), (1, 80), (1, 81), (1, 82), (1, 83), (1, 84), (1, 85), (1, 86), (1, 87), (1, 88), (1, 89), (1, 90), (1, 91), (1, 92), (1, 93), (1, 94), (1, 95), (1, 96), (1, 97), (1, 98), (1, 99), (1, 100), (1, 101), (1, 102), (1, 103), (1, 104), (1, 105), (1, 106), (1, 107), (1, 108), (1, 109), (1, 110), (1, 111), (1, 112), (1, 113), (1, 114), (1, 115), (1, 116), (1, 117), (1, 118), (1, 119), (1, 120), (1, 121), (1, 122), (1, 123), (1, 124), (1, 125), (1, 126), (1, 127), (1, 128), (1, 129), (1, 130), (1, 131), (1, 132), (1, 133), (1, 134), (1, 135), (1, 136), (1, 137), (1, 138), (1, 139), (1, 140), (1, 141), (1, 142), (1, 143), (1, 144), (1, 145), (1, 146), (1, 147), (1, 148), (1, 149), (1, 150), (1, 151), (1, 152), (1, 153), (1, 154), (1, 155), (1, 156), (1, 157), (1, 158), (1, 159), (1, 160), (1, 161), (1, 162), (1, 163), (1, 164), (1, 165), (1, 166), (1, 167), (1, 168), (1, 169), (1, 170), (1, 171), (1, 172), (1, 173), (1, 174), (1, 175), (1, 176), (1, 177), (1, 178), (1, 179), (1, 180), (1, 181), (1, 182), (1, 183), (1, 184), (1, 185), (1, 186), (1, 187), (1, 188), (1, 189), (1, 190), (1, 191), (1, 192), (1, 193), (1, 194), (1, 195), (1, 196), (1, 197), (1, 198), (1, 199), (1, 200), (1, 201), (1, 202), (1, 203), (1, 204), (1, 205), (1, 206), (1, 207), (1, 208), (1, 209), (1, 210), (1, 
211), (1, 212), (1, 213), (1, 214), (1, 215), (1, 216), (1, 217), (1, 218), (1, 219), (1, 220), (1, 221), (1, 222), (1, 223), (1, 224), (1, 225), (1, 226), (1, 227), (1, 228), (1, 229), (1, 230), (1, 231), (1, 232), (1, 233), (1, 234), (1, 235), (1, 236), (1, 237), (1, 238), (1, 239), (1, 240), (1, 241), (1, 242), (1, 243), (1, 244), (1, 245), (1, 246), (1, 247), (1, 248), (1, 249), (1, 250), (1, 251), (1, 252), (1, 253), (1, 254), (1, 255), (1, 256), (1, 257), (1, 258), (1, 259), (1, 260), (1, 261), (1, 262), (1, 263), (1, 264), (1, 265), (1, 266), (1, 267), (1, 268), (1, 269), (1, 270), (1, 271), (1, 272), (1, 273), (1, 274), (1, 275), (1, 276), (1, 277), (1, 278), (1, 279), (1, 280), (1, 281), (1, 282), (1, 283), (1, 284), (1, 285), (1, 286), (1, 287), (1, 288), (35, 1525), (69, 603), (71, 710), (71, 714), (90, 710), (217, 710), (247, 288), (247, 603), (247, 1525), (288, 289), (288, 290), (288, 291), (288, 292), (288, 293), (288, 294), (288, 295), (288, 296), (288, 297), (288, 298), (288, 299), (288, 300), (288, 301), (288, 302), (288, 303), (288, 304), (288, 305), (288, 306), (288, 307), (288, 308), (288, 309), (288, 310), (288, 311), (288, 312), (288, 313), (288, 314), (288, 315), (288, 316), (288, 317), (288, 318), (288, 319), (288, 320), (288, 321), (288, 322), (288, 323), (288, 324), (288, 325), (288, 326), (288, 327), (288, 328), (288, 329), (288, 330), (288, 331), (288, 332), (288, 333), (288, 334), (288, 335), (288, 336), (288, 337), (288, 338), (288, 339), (288, 340), (288, 341), (288, 342), (288, 343), (288, 344), (288, 345), (288, 346), (288, 347), (288, 348), (288, 349), (288, 350), (288, 351), (288, 352), (288, 353), (288, 354), (288, 355), (288, 356), (288, 357), (288, 358), (288, 359), (288, 360), (288, 361), (288, 362), (288, 363), (288, 364), (288, 365), (288, 366), (288, 367), (288, 368), (288, 369), (288, 370), (288, 371), (288, 372), (288, 373), (288, 374), (288, 375), (288, 376), (288, 377), (288, 378), (288, 379), (288, 380), (288, 381), 
(288, 382), (288, 383), (288, 384), (288, 385), (288, 386), (288, 387), (288, 388), (288, 389), (288, 390), (288, 391), (288, 392), (288, 393), (288, 394), (288, 395), (288, 396), (288, 397), (288, 398), (288, 399), (288, 400), (288, 401), (288, 402), (288, 403), (288, 404), (288, 405), (288, 406), (288, 407), (288, 408), (288, 409), (288, 410), (288, 411), (288, 412), (288, 413), (288, 414), (288, 415), (288, 416), (288, 417), (288, 418), (288, 419), (288, 420), (288, 421), (288, 422), (288, 423), (288, 424), (288, 425), (288, 426), (288, 427), (288, 428), (288, 429), (288, 430), (288, 431), (288, 432), (288, 433), (288, 434), (288, 435), (288, 436), (288, 437), (288, 438), (288, 439), (288, 440), (288, 441), (288, 442), (288, 443), (288, 444), (288, 445), (288, 446), (288, 447), (288, 448), (288, 449), (288, 450), (288, 451), (288, 452), (288, 453), (288, 454), (288, 455), (288, 456), (288, 457), (288, 458), (288, 459), (288, 460), (288, 461), (288, 462), (288, 463), (288, 464), (288, 465), (288, 466), (288, 467), (288, 468), (288, 469), (288, 470), (288, 471), (288, 472), (288, 473), (288, 474), (288, 475), (288, 476), (288, 477), (288, 478), (288, 479), (288, 480), (288, 481), (288, 482), (288, 483), (288, 484), (288, 485), (288, 486), (288, 487), (288, 488), (288, 489), (288, 490), (288, 491), (288, 492), (288, 493), (288, 494), (288, 495), (288, 496), (288, 497), (288, 498), (288, 499), (288, 500), (288, 501), (288, 502), (288, 503), (288, 504), (288, 505), (288, 506), (288, 507), (288, 508), (288, 509), (288, 510), (288, 511), (288, 512), (288, 513), (288, 514), (288, 515), (288, 516), (288, 517), (288, 518), (288, 519), (288, 520), (288, 521), (288, 522), (288, 523), (288, 524), (288, 525), (288, 526), (288, 527), (288, 528), (288, 529), (288, 530), (288, 531), (288, 532), (288, 533), (288, 534), (288, 535), (288, 536), (288, 537), (288, 538), (288, 539), (288, 540), (288, 541), (288, 542), (288, 543), (288, 544), (288, 545), (288, 546), (288, 547), (288, 
548), (288, 549), (288, 550), (288, 551), (288, 552), (288, 553), (288, 554), (288, 555), (288, 556), (288, 557), (288, 558), (288, 559), (288, 560), (288, 561), (288, 562), (288, 563), (288, 564), (288, 565), (288, 566), (288, 567), (288, 568), (288, 569), (288, 570), (288, 571), (288, 572), (288, 573), (288, 574), (288, 575), (288, 576), (288, 577), (288, 578), (288, 579), (288, 580), (288, 581), (288, 582), (288, 583), (288, 584), (288, 585), (288, 586), (288, 587), (288, 588), (288, 589), (288, 590), (288, 591), (288, 592), (288, 593), (288, 594), (288, 595), (288, 596), (288, 597), (288, 598), (288, 599), (288, 600), (288, 601), (288, 602), (288, 603), (288, 604), (288, 605), (288, 606), (288, 607), (288, 608), (288, 609), (288, 610), (288, 611), (288, 612), (288, 613), (288, 614), (288, 615), (288, 616), (288, 617), (288, 618), (288, 619), (288, 620), (288, 621), (288, 622), (288, 623), (288, 624), (288, 625), (288, 626), (288, 627), (288, 628), (288, 629), (288, 630), (288, 631), (288, 632), (288, 633), (288, 634), (288, 635), (288, 636), (288, 637), (288, 638), (288, 639), (288, 640), (288, 641), (288, 642), (288, 643), (288, 644), (288, 645), (288, 646), (288, 647), (288, 648), (288, 649), (288, 650), (288, 651), (288, 652), (288, 653), (288, 654), (288, 655), (288, 656), (288, 657), (288, 658), (288, 659), (288, 660), (288, 661), (288, 662), (288, 663), (288, 664), (288, 665), (288, 666), (288, 667), (288, 668), (288, 669), (288, 670), (288, 671), (288, 672), (288, 673), (288, 674), (288, 675), (288, 676), (288, 677), (288, 678), (288, 679), (288, 680), (288, 681), (288, 682), (288, 683), (288, 684), (288, 685), (288, 686), (288, 687), (288, 688), (288, 689), (288, 690), (288, 691), (288, 692), (288, 693), (288, 694), (288, 695), (288, 696), (288, 697), (288, 698), (288, 699), (288, 700), (288, 701), (288, 702), (288, 703), (288, 704), (288, 705), (288, 706), (288, 707), (288, 708), (288, 709), (288, 710), (288, 711), (288, 712), (288, 713), (288, 714), 
(288, 715), (288, 716), (288, 717), (288, 718), (288, 719), (288, 720), (288, 721), (288, 722), (288, 723), (288, 724), (288, 725), (288, 726), (288, 727), (288, 728), (288, 729), (288, 730), (288, 731), (288, 732), (288, 733), (288, 734), (288, 735), (288, 736), (288, 737), (288, 738), (288, 739), (288, 740), (288, 741), (288, 742), (288, 743), (288, 744), (288, 745), (288, 746), (288, 747), (288, 748), (288, 749), (288, 750), (288, 751), (288, 752), (288, 753), (288, 754), (288, 755), (288, 756), (288, 757), (288, 758), (288, 759), (288, 760), (288, 761), (288, 762), (288, 763), (288, 764), (288, 765), (288, 766), (288, 767), (1525, 716), (1525, 719), (1525, 1526), (1525, 1527), (1525, 1528), (1525, 1529), (1525, 1530), (1525, 1531), (1525, 1532), (1525, 1533), (1525, 1534), (1525, 1535), (1525, 1536), (1525, 1537), (1525, 1538), (1525, 1539), (1525, 1540), (1525, 1541), (1525, 1542), (1525, 1543), (1525, 1544), (1525, 1545), (1525, 1546), (1525, 1547), (1525, 1548), (1525, 1549), (1525, 1550), (1525, 1551), (1525, 1552), (1525, 1553), (1525, 1554), (1525, 1555), (1525, 1556), (1525, 1557), (1525, 1558), (1525, 1559), (1525, 1560), (1525, 1561), (1525, 1562), (1525, 1563), (1525, 1564), (1525, 1565), (1525, 1566), (1525, 1567), (1525, 1568), (1525, 1569), (1525, 1570), (1525, 1571), (1525, 1572), (1525, 1573), (1525, 1574), (1525, 1575), (1525, 1576), (1525, 1577), (1525, 1578), (1525, 1579), (1525, 1580), (1525, 1581), (1525, 1582), (1525, 1583), (1525, 1584), (1525, 1585), (1525, 1586), (1525, 1587), (1525, 1588), (1525, 1589), (1525, 1590), (1525, 1591), (1525, 1592), (1525, 1593), (1525, 1594), (1525, 1595), (1525, 1596), (1525, 1597), (1525, 1598), (1525, 1599), (1525, 1600), (1525, 1601), (1525, 1602), (1525, 1603), (1525, 1604), (1525, 1605), (1525, 1606), (1525, 1607), (1525, 1608), (1525, 1609), (1525, 1610), (1525, 1611), (1525, 1612), (1525, 1613), (1525, 1614), (1525, 1615), (1525, 1616), (1525, 1617), (1525, 1618), (1525, 1619), (1525, 1620), (1525, 
1621), (1525, 1622), (1525, 1623), (1525, 1624), (1525, 1625), (1525, 1626), (1525, 1627), (1525, 1628), (1525, 1629), (1525, 1630), (1525, 1631), (1525, 1632), (1525, 1633), (1525, 1634), (1525, 1635), (1525, 1636), (1525, 1637), (1525, 1638), (1525, 1639), (1525, 1640), (1525, 1641), (1525, 1642), (1525, 1643), (1525, 1644), (1525, 1645), (1525, 1646), (1525, 1647), (1525, 1648), (1525, 1649), (1525, 1650), (1525, 1651), (1525, 1652), (1525, 1653), (1525, 1654), (1525, 1655), (1525, 1656), (1525, 1657), (1525, 1658), (1525, 1659), (1525, 1660), (1525, 1661), (1525, 1662), (1525, 1663), (1525, 1664), (1525, 1665), (1525, 1666), (1525, 1667), (1525, 1668), (1525, 1669), (1525, 1670), (1525, 1671), (1525, 1672), (1525, 1673), (1525, 1674), (1525, 1675), (1525, 1676), (1525, 1677), (1525, 1678), (1525, 1679), (1525, 1680), (1525, 1681), (1525, 1682), (1525, 1683), (1525, 1684), (1525, 1685), (1525, 1686), (1525, 1687), (1525, 1688), (1525, 1689), (1525, 1690), (1525, 1691), (1525, 1692), (1525, 1693), (1525, 1694), (1525, 1695), (1525, 1696), (1525, 1697), (1525, 1698), (1525, 1699), (1525, 1700), (1525, 1701), (1525, 1702), (1525, 1703), (1525, 1704), (1525, 1705), (1525, 1706), (1525, 1707), (1525, 1708), (1525, 1709), (1525, 1710), (1525, 1711), (1525, 1712), (1525, 1713), (1525, 1714), (1525, 1715), (1525, 1716), (1525, 1717), (1525, 1718), (1525, 1719), (1525, 1720), (1525, 1721), (1525, 1722), (1525, 1723), (1525, 1724), (1525, 1725), (1525, 1726), (1525, 1727), (1525, 1728), (1525, 1729), (1525, 1730), (1525, 1731), (1525, 1732), (1525, 1733), (1525, 1734), (1525, 1735), (1525, 1736), (1525, 1737), (1525, 1738), (1525, 1739), (1525, 1740), (1525, 1741), (1525, 1742), (1525, 1743), (1525, 1744), (1525, 1745), (1525, 1746), (1525, 1747), (1525, 1748), (1525, 1749), (1525, 1750), (1525, 1751), (1525, 1752), (1525, 1753), (1525, 1754), (1525, 1755), (1525, 1756), (1525, 1757), (1525, 1758), (1525, 1759), (1525, 1760), (1525, 1761), (1525, 1762), (1525, 1763), 
(1525, 1764), (1525, 1765), (1525, 1766), (1525, 1767), (1525, 1768), (1525, 1769), (1525, 1770), (1525, 1771), (1525, 1772), (1525, 1773), (1525, 1774), (1525, 1775), (1525, 1776), (1525, 1777), (1525, 1778), (1525, 1779), (1525, 1780), (1525, 1781), (1525, 1782), (1525, 1783), (1525, 1784), (1525, 1785), (1525, 1786), (1525, 1787), (1525, 1788), (1525, 1789), (1525, 1790), (1525, 1791), (1525, 1792), (1525, 1793), (1525, 1794), (1525, 1795), (1525, 1796), (1525, 1797), (1525, 1798), (1525, 1799), (1525, 1800), (1525, 1801), (1525, 1802), (1525, 1803), (1525, 1804), (1525, 1805), (1525, 1806), (1525, 1807), (1525, 1808), (1525, 1809), (1525, 1810), (1525, 1811), (1525, 1812), (1525, 1813), (1525, 1814), (1525, 1815), (1525, 1816), (1525, 1817), (1525, 1818), (1525, 1819), (1525, 1820), (1525, 1821), (1525, 1822), (1525, 1823), (1525, 1824), (1525, 1825), (1525, 1826), (1525, 1827), (1525, 1828), (1525, 1829), (1525, 1830), (1525, 1831), (1525, 1832), (1525, 1833), (1525, 1834), (1525, 1835), (1525, 1836), (1525, 1837), (1525, 1838), (1525, 1839), (1525, 1840), (1525, 1841), (1525, 1842), (1525, 1843), (1525, 1844), (1525, 1845), (1525, 1846), (1525, 1847), (1525, 1848), (1525, 1849), (1525, 1850), (1525, 1851), (1525, 1852), (1525, 1853), (1525, 1854), (1525, 1855), (1525, 1856), (1525, 1857), (1525, 1858), (1525, 1859), (1525, 1860), (1525, 1861), (1525, 1862), (1525, 1863), (1525, 1864), (1525, 1865), (1525, 1866), (1525, 1867), (1525, 1868), (1525, 1869), (1525, 1870), (1525, 1871), (1525, 1872), (1525, 1873), (1525, 1874), (1525, 1875), (1525, 1876), (1525, 1877), (1525, 1878), (1525, 1879), (1525, 1880), (1525, 1881), (1525, 1882), (1525, 1883), (1525, 1884), (1525, 1885), (1525, 1886), (1525, 1887), (1525, 1888), (1525, 1889), (1525, 1890), (1525, 1891), (1525, 1892), (1525, 1893), (1525, 1894), (1525, 1895), (1525, 1896), (1525, 1897), (1525, 1898), (1525, 1899), (1525, 1900), (1525, 1901), (1525, 1902), (1525, 1903), (1525, 1904), (1525, 1905), (1525, 
1906), (1525, 1907), (1525, 1908), (1525, 1909), (1525, 1910), (1525, 1911), (1525, 1912), (1525, 1913), (1525, 1914), (1525, 1915), (1525, 1916), (1525, 1917), (1525, 1918), (1525, 1919), (1525, 1920), (1525, 1921), (1525, 1922), (1525, 1923), (1525, 1924), (1525, 1925), (1525, 1926), (1525, 1927), (1525, 1928), (1525, 1929), (1525, 1930), (1525, 1931), (1525, 1932), (1525, 1933), (1525, 1934), (1525, 1935), (1525, 1936), (1525, 1937), (1525, 1938), (1525, 1939), (1525, 1940), (1525, 1941), (1525, 1942), (1525, 1943), (1525, 1944), (1525, 1945), (1525, 1946), (1525, 1947), (1525, 1948), (1525, 1949), (1525, 1950), (1525, 1951), (1525, 1952), (1525, 1953), (1525, 1954), (1525, 1955), (1525, 1956), (1525, 1957), (1525, 1958), (1525, 1959), (1525, 1960), (1525, 1961), (1525, 1962), (1525, 1963), (1525, 1964), (1525, 1965), (1525, 1966), (1525, 1967), (1525, 1968), (1525, 1969), (1525, 1970), (1525, 1971), (1525, 1972), (1525, 1973), (1525, 1974), (1525, 1975), (1525, 1976), (1525, 1977), (1525, 1978), (1525, 1979), (1525, 1980), (1525, 1981), (1525, 1982), (1525, 1983), (1525, 1984), (1525, 1985), (1525, 1986), (1525, 1987), (1525, 1988), (1525, 1989), (1525, 1990), (1525, 1991), (1525, 1992), (1525, 1993), (1525, 1994), (1525, 1995), (1525, 1996), (1525, 1997), (1525, 1998), (1525, 1999), (1525, 2000), (1525, 2001), (1525, 2002), (1525, 2003), (1525, 2004), (1525, 2005), (1525, 2006), (1525, 2007), (1525, 2008), (1525, 2009), (1525, 2010), (1525, 2011), (1525, 2012), (1525, 2013), (1525, 2014), (1525, 2015), (1525, 2016), (1525, 2017), (1525, 2018), (1525, 2019), (1525, 2020), (1525, 2021), (1525, 2022), (1525, 2023), (1525, 2024), (1525, 2025), (1525, 2026), (1525, 2027), (1525, 2028), (1525, 2029), (1525, 2030), (1525, 2031), (1525, 2032), (1525, 2033), (1525, 2034), (1525, 2035), (1525, 2036), (1525, 2037), (1525, 2038), (1525, 2039), (1525, 2040), (1525, 2041), (1525, 2042), (1525, 2043), (1525, 2044), (1525, 2045), (1525, 2046), (1525, 2047), (1525, 2048), 
(1525, 2049), (1525, 2050), (1525, 2051), (1525, 2052), (1525, 2053), (1525, 2054), (1525, 2055), (1525, 2056), (1525, 2057), (1525, 2058), (1525, 2059), (1525, 2060), (1525, 2061), (1525, 2062), (1525, 2063), (1525, 2064), (1525, 2065), (1525, 2066), (1525, 2067), (1525, 2068), (1525, 2069), (1525, 2070), (1525, 2071), (1525, 2072), (1525, 2073), (1525, 2074), (1525, 2075), (1525, 2076), (1525, 2077), (1525, 2078), (1525, 2079), (1525, 2080), (1525, 2081), (1525, 2082), (1525, 2083), (1525, 2084), (1525, 2085), (1525, 2086), (1525, 2087), (1525, 2088), (1525, 2089), (1525, 2090), (1525, 2091), (1525, 2092), (1525, 2093), (1525, 2094), (1525, 2095), (1525, 2096), (1525, 2097), (1525, 2098), (1525, 2099), (1525, 2100), (1525, 2101), (1525, 2102), (1525, 2103), (1525, 2104), (1525, 2105), (1525, 2106), (1525, 2107), (1525, 2108), (1525, 2109), (1525, 2110), (1525, 2111), (1525, 2112), (1525, 2113), (1525, 2114), (1525, 2115), (1525, 2116), (1525, 2117), (1525, 2118), (1525, 2119), (1525, 2120), (1525, 2121), (1525, 2122), (1525, 2123), (1525, 2124), (1525, 2125), (1525, 2126), (1525, 2127), (1525, 2128), (1525, 2129), (1525, 2130), (1525, 2131), (1525, 2132), (1525, 2133), (1525, 2134), (1525, 2135), (1525, 2136), (1525, 2137), (1525, 2138), (1525, 2139), (1525, 2140), (1525, 2141), (1525, 2142), (1525, 2143), (1525, 2144), (1525, 2145), (1525, 2146), (1525, 2147), (1525, 2148), (1525, 2149), (1525, 2150), (1525, 2151), (1525, 2152), (1525, 2153), (1525, 2154), (1525, 2155), (1525, 2156), (1525, 2157), (1525, 2158), (1525, 2159), (1525, 2160), (1525, 2161), (1525, 2162), (1525, 2163), (1525, 2164), (1525, 2165), (1525, 2166), (1525, 2167), (1525, 2168), (1525, 2169), (1525, 2170), (1525, 2171), (1525, 2172), (1525, 2173), (1525, 2174), (1525, 2175), (1525, 2176), (1525, 2177), (1525, 2178), (1525, 2179), (1525, 2180), (1525, 2181), (1525, 2182), (1525, 2183), (1525, 2184), (1525, 2185), (1525, 2186), (1525, 2187), (1525, 2188), (1525, 2189), (1525, 2190), (1525, 
2191), (1525, 2192), (1525, 2193), (1525, 2194), (1525, 2195), (1525, 2196), (1525, 2197), (1525, 2198), (1525, 2199), (1525, 2200), (1525, 2201), (1525, 2202), (1525, 2203), (1525, 2204), (1525, 2205), (1525, 2206), (1525, 2207), (1525, 2208), (1525, 2209), (1525, 2210), (1525, 2211), (1525, 2212), (1525, 2213), (1525, 2214), (1525, 2215), (1525, 2216), (1525, 2217), (1525, 2218), (1525, 2219), (1525, 2220), (1525, 2221), (1525, 2222), (1525, 2223), (1525, 2224), (1525, 2225), (1525, 2226), (1525, 2227), (1525, 2228), (1525, 2229), (1525, 2230), (1525, 2231), (603, 469), (603, 493), (603, 510), (603, 526), (603, 584), (603, 594), (603, 624), (603, 639), (603, 764), (603, 768), (603, 769), (603, 770), (603, 771), (603, 772), (603, 773), (603, 774), (603, 775), (603, 776), (603, 777), (603, 778), (603, 779), (603, 780), (603, 781), (603, 782), (603, 783), (603, 784), (603, 785), (603, 786), (603, 787), (603, 788), (603, 789), (603, 790), (603, 791), (603, 792), (603, 793), (603, 794), (603, 795), (603, 796), (603, 797), (603, 798), (603, 799), (603, 800), (603, 801), (603, 802), (603, 803), (603, 804), (603, 805), (603, 806), (603, 807), (603, 808), (603, 809), (603, 810), (603, 811), (603, 812), (603, 813), (603, 814), (603, 815), (603, 816), (603, 817), (603, 818), (603, 819), (603, 820), (603, 821), (603, 822), (603, 823), (603, 824), (603, 825), (603, 826), (603, 827), (603, 828), (603, 829), (603, 830), (603, 831), (603, 832), (603, 833), (603, 834), (603, 835), (603, 836), (603, 837), (603, 838), (603, 839), (603, 840), (603, 841), (603, 842), (603, 843), (603, 844), (603, 845), (603, 846), (603, 847), (603, 848), (603, 849), (603, 850), (603, 851), (603, 852), (603, 853), (603, 854), (603, 855), (603, 856), (603, 857), (603, 858), (603, 859), (603, 860), (603, 861), (603, 862), (603, 863), (603, 864), (603, 865), (603, 866), (603, 867), (603, 868), (603, 869), (603, 870), (603, 871), (603, 872), (603, 873), (603, 874), (603, 875), (603, 876), (603, 877), 
(603, 878), (603, 879), (603, 880), (603, 881), (603, 882), (603, 883), (603, 884), (603, 885), (603, 886), (603, 887), (603, 888), (603, 889), (603, 890), (603, 891), (603, 892), (603, 893), (603, 894), (603, 895), (603, 896), (603, 897), (603, 898), (603, 899), (603, 900), (603, 901), (603, 902), (603, 903), (603, 904), (603, 905), (603, 906), (603, 907), (603, 908), (603, 909), (603, 910), (603, 911), (603, 912), (603, 913), (603, 914), (603, 915), (603, 916), (603, 917), (603, 918), (603, 919), (603, 920), (603, 921), (603, 922), (603, 923), (603, 924), (603, 925), (603, 926), (603, 927), (603, 928), (603, 929), (603, 930), (603, 931), (603, 932), (603, 933), (603, 934), (603, 935), (603, 936), (603, 937), (603, 938), (603, 939), (603, 940), (603, 941), (603, 942), (603, 943), (603, 944), (603, 945), (603, 946), (603, 947), (603, 948), (603, 949), (603, 950), (603, 951), (603, 952), (603, 953), (603, 954), (603, 955), (603, 956), (603, 957), (603, 958), (603, 959), (603, 960), (603, 961), (603, 962), (603, 963), (603, 964), (603, 965), (603, 966), (603, 967), (603, 968), (603, 969), (603, 970), (603, 971), (603, 972), (603, 973), (603, 974), (603, 975), (603, 976), (603, 977), (603, 978), (603, 979), (603, 980), (603, 981), (603, 982), (603, 983), (603, 984), (603, 985), (603, 986), (603, 987), (603, 988), (603, 989), (603, 990), (603, 991), (603, 992), (603, 993), (603, 994), (603, 995), (603, 996), (603, 997), (603, 998), (603, 999), (603, 1000), (603, 1001), (603, 1002), (603, 1003), (603, 1004), (603, 1005), (603, 1006), (603, 1007), (603, 1008), (603, 1009), (603, 1010), (603, 1011), (603, 1012), (603, 1013), (603, 1014), (603, 1015), (603, 1016), (603, 1017), (603, 1018), (603, 1019), (603, 1020), (603, 1021), (603, 1022), (603, 1023), (603, 1024), (603, 1025), (603, 1026), (603, 1027), (603, 1028), (603, 1029), (603, 1030), (603, 1031), (603, 1032), (603, 1033), (603, 1034), (603, 1035), (603, 1036), (603, 1037), (603, 1038), (603, 1039), (603, 1040), 
(603, 1041), (603, 1042), (603, 1043), (603, 1044), (603, 1045), (603, 1046), (603, 1047), (603, 1048), (603, 1049), (603, 1050), (603, 1051), (603, 1052), (603, 1053), (603, 1054), (603, 1055), (603, 1056), (603, 1057), (603, 1058), (603, 1059), (603, 1060), (603, 1061), (603, 1062), (603, 1063), (603, 1064), (603, 1065), (603, 1066), (603, 1067), (603, 1068), (603, 1069), (603, 1070), (603, 1071), (603, 1072), (603, 1073), (603, 1074), (603, 1075), (603, 1076), (603, 1077), (603, 1078), (603, 1079), (603, 1080), (603, 1081), (603, 1082), (603, 1083), (603, 1084), (603, 1085), (603, 1086), (603, 1087), (603, 1088), (603, 1089), (603, 1090), (603, 1091), (603, 1092), (603, 1093), (603, 1094), (603, 1095), (603, 1096), (603, 1097), (603, 1098), (603, 1099), (603, 1100), (603, 1101), (603, 1102), (603, 1103), (603, 1104), (603, 1105), (603, 1106), (603, 1107), (603, 1108), (603, 1109), (603, 1110), (603, 1111), (603, 1112), (603, 1113), (603, 1114), (603, 1115), (603, 1116), (603, 1117), (603, 1118), (603, 1119), (603, 1120), (603, 1121), (603, 1122), (603, 1123), (603, 1124), (603, 1125), (603, 1126), (603, 1127), (603, 1128), (603, 1129), (603, 1130), (603, 1131), (603, 1132), (603, 1133), (603, 1134), (603, 1135), (603, 1136), (603, 1137), (603, 1138), (603, 1139), (603, 1140), (603, 1141), (603, 1142), (603, 1143), (603, 1144), (603, 1145), (603, 1146), (603, 1147), (603, 1148), (603, 1149), (603, 1150), (603, 1151), (603, 1152), (603, 1153), (603, 1154), (603, 1155), (603, 1156), (603, 1157), (603, 1158), (603, 1159), (603, 1160), (603, 1161), (603, 1162), (603, 1163), (603, 1164), (603, 1165), (603, 1166), (603, 1167), (603, 1168), (603, 1169), (603, 1170), (603, 1171), (603, 1172), (603, 1173), (603, 1174), (603, 1175), (603, 1176), (603, 1177), (603, 1178), (603, 1179), (603, 1180), (603, 1181), (603, 1182), (603, 1183), (603, 1184), (603, 1185), (603, 1186), (603, 1187), (603, 1188), (603, 1189), (603, 1190), (603, 1191), (603, 1192), (603, 1193), (603, 
1194), (603, 1195), (603, 1196), (603, 1197), (603, 1198), (603, 1199), (603, 1200), (603, 1201), (603, 1202), (603, 1203), (603, 1204), (603, 1205), (603, 1206), (603, 1207), (603, 1208), (603, 1209), (603, 1210), (603, 1211), (603, 1212), (603, 1213), (603, 1214), (603, 1215), (603, 1216), (603, 1217), (603, 1218), (603, 1219), (603, 1220), (603, 1221), (603, 1222), (603, 1223), (603, 1224), (603, 1225), (603, 1226), (603, 1227), (603, 1228), (603, 1229), (603, 1230), (603, 1231), (603, 1232), (603, 1233), (603, 1234), (603, 1235), (603, 1236), (603, 1237), (603, 1238), (603, 1239), (603, 1240), (603, 1241), (603, 1242), (603, 1243), (603, 1244), (603, 1245), (603, 1246), (603, 1247), (603, 1248), (603, 1249), (603, 1250), (603, 1251), (603, 1252), (603, 1253), (603, 1254), (603, 1255), (603, 1256), (603, 1257), (603, 1258), (603, 1259), (603, 1260), (603, 1261), (603, 1262), (603, 1263), (603, 1264), (603, 1265), (603, 1266), (603, 1267), (603, 1268), (603, 1269), (603, 1270), (603, 1271), (603, 1272), (603, 1273), (603, 1274), (603, 1275), (603, 1276), (603, 1277), (603, 1278), (603, 1279), (603, 1280), (603, 1281), (603, 1282), (603, 1283), (603, 1284), (603, 1285), (603, 1286), (603, 1287), (603, 1288), (603, 1289), (603, 1290), (603, 1291), (603, 1292), (603, 1293), (603, 1294), (603, 1295), (603, 1296), (603, 1297), (603, 1298), (603, 1299), (603, 1300), (603, 1301), (603, 1302), (603, 1303), (603, 1304), (603, 1305), (603, 1306), (603, 1307), (603, 1308), (603, 1309), (603, 1310), (603, 1311), (603, 1312), (603, 1313), (603, 1314), (603, 1315), (603, 1316), (603, 1317), (603, 1318), (603, 1319), (603, 1320), (603, 1321), (603, 1322), (603, 1323), (603, 1324), (603, 1325), (603, 1326), (603, 1327), (603, 1328), (603, 1329), (603, 1330), (603, 1331), (603, 1332), (603, 1333), (603, 1334), (603, 1335), (603, 1336), (603, 1337), (603, 1338), (603, 1339), (603, 1340), (603, 1341), (603, 1342), (603, 1343), (603, 1344), (603, 1345), (603, 1346), (603, 1347), 
(603, 1348), (603, 1349), (603, 1350), (603, 1351), (603, 1352), (603, 1353), (603, 1354), (603, 1355), (603, 1356), (603, 1357), (603, 1358), (603, 1359), (603, 1360), (603, 1361), (603, 1362), (603, 1363), (603, 1364), (603, 1365), (603, 1366), (603, 1367), (603, 1368), (603, 1369), (603, 1370), (603, 1371), (603, 1372), (603, 1373), (603, 1374), (603, 1375), (603, 1376), (603, 1377), (603, 1378), (603, 1379), (603, 1380), (603, 1381), (603, 1382), (603, 1383), (603, 1384), (603, 1385), (603, 1386), (603, 1387), (603, 1388), (603, 1389), (603, 1390), (603, 1391), (603, 1392), (603, 1393), (603, 1394), (603, 1395), (603, 1396), (603, 1397), (603, 1398), (603, 1399), (603, 1400), (603, 1401), (603, 1402), (603, 1403), (603, 1404), (603, 1405), (603, 1406), (603, 1407), (603, 1408), (603, 1409), (603, 1410), (603, 1411), (603, 1412), (603, 1413), (603, 1414), (603, 1415), (603, 1416), (603, 1417), (603, 1418), (603, 1419), (603, 1420), (603, 1421), (603, 1422), (603, 1423), (603, 1424), (603, 1425), (603, 1426), (603, 1427), (603, 1428), (603, 1429), (603, 1430), (603, 1431), (603, 1432), (603, 1433), (603, 1434), (603, 1435), (603, 1436), (603, 1437), (603, 1438), (603, 1439), (603, 1440), (603, 1441), (603, 1442), (603, 1443), (603, 1444), (603, 1445), (603, 1446), (603, 1447), (603, 1448), (603, 1449), (603, 1450), (603, 1451), (603, 1452), (603, 1453), (603, 1454), (603, 1455), (603, 1456), (603, 1457), (603, 1458), (603, 1459), (603, 1460), (603, 1461), (603, 1462), (603, 1463), (603, 1464), (603, 1465), (603, 1466), (603, 1467), (603, 1468), (603, 1469), (603, 1470), (603, 1471), (603, 1472), (603, 1473), (603, 1474), (603, 1475), (603, 1476), (603, 1477), (603, 1478), (603, 1479), (603, 1480), (603, 1481), (603, 1482), (603, 1483), (603, 1484), (603, 1485), (603, 1486), (603, 1487), (603, 1488), (603, 1489), (603, 1490), (603, 1491), (603, 1492), (603, 1493), (603, 1494), (603, 1495), (603, 1496), (603, 1497), (603, 1498), (603, 1499), (603, 1500), (603, 
1501), (603, 1502), (603, 1503), (603, 1504), (603, 1505), (603, 1506), (603, 1507), (603, 1508), (603, 1509), (603, 1510), (603, 1511), (603, 1512), (603, 1513), (603, 1514), (603, 1515), (603, 1516), (603, 1517), (603, 1518), (603, 1519), (603, 1520), (603, 1521), (603, 1522), (603, 1523), (603, 1524), (710, 711), (710, 712), (710, 713), (710, 714), (710, 715), (710, 716), (710, 717), (710, 718), (710, 719), (710, 720), (710, 2329), (710, 2330), (710, 2331), (710, 2332), (710, 2333), (710, 2334), (710, 2335), (710, 2336), (710, 2337), (710, 2338), (710, 2339), (710, 2340), (710, 2341), (710, 2342), (710, 2343), (710, 2344), (710, 2345), (710, 2346), (710, 2347), (710, 2348), (710, 2349), (710, 2350), (710, 2351), (710, 2352), (710, 2353), (710, 2354), (710, 2355), (710, 2356), (710, 2357), (710, 2358), (710, 2359), (710, 2360), (710, 2361), (710, 2362), (710, 2363), (710, 2364), (710, 2365), (710, 2366), (710, 2367), (710, 2368), (710, 2369), (710, 2370), (710, 2371), (710, 2372), (710, 2373), (710, 2374), (710, 2375), (710, 2376), (710, 2377), (710, 2378), (710, 2379), (710, 2380), (710, 2381), (710, 2382), (710, 2383), (710, 2384), (710, 2385), (710, 2386), (710, 2387), (710, 2388), (710, 2389), (710, 2390), (710, 2391), (710, 2392), (710, 2393), (710, 2394), (710, 2395), (710, 2396), (710, 2397), (710, 2398), (710, 2399), (710, 2400), (710, 2401), (710, 2402), (710, 2403), (710, 2404), (710, 2405), (710, 2406), (710, 2407), (710, 2408), (710, 2409), (710, 2410), (710, 2411), (710, 2412), (710, 2413), (710, 2414), (710, 2415), (710, 2416), (710, 2417), (710, 2418), (710, 2419), (710, 2420), (710, 2421), (710, 2422), (710, 2423), (710, 2424), (710, 2425), (710, 2426), (710, 2427), (710, 2428), (710, 2429), (710, 2430), (710, 2431), (710, 2432), (710, 2433), (710, 2434), (710, 2435), (710, 2436), (710, 2437), (710, 2438), (710, 2439), (710, 2440), (710, 2441), (710, 2442), (710, 2443), (710, 2444), (710, 2445), (710, 2446), (710, 2447), (710, 2448), (710, 2449), 
(710, 2450), (710, 2451), (710, 2452), (710, 2453), (710, 2454), (710, 2455), (710, 2456), (710, 2457), (710, 2458), (710, 2459), (710, 2460), (710, 2461), (710, 2462), (710, 2463), (710, 2464), (710, 2465), (710, 2466), (710, 2467), (710, 2468), (710, 2469), (710, 2470), (710, 2471), (710, 2472), (710, 2473), (710, 2474), (710, 2475), (710, 2476), (710, 2477), (710, 2478), (710, 2479), (710, 2480), (710, 2481), (710, 2482), (710, 2483), (710, 2484), (710, 2485), (710, 2486), (710, 2487), (710, 2488), (710, 2489), (710, 2490), (710, 2491), (710, 2492), (710, 2493), (710, 2494), (710, 2495), (710, 2496), (710, 2497), (710, 2498), (710, 2499), (710, 2500), (710, 2501), (710, 2502), (710, 2503), (710, 2504), (710, 2505), (710, 2506), (710, 2507), (710, 2508), (710, 2509), (710, 2510), (710, 2511), (710, 2512), (710, 2513), (710, 2514), (710, 2515), (710, 2516), (710, 2517), (710, 2518), (710, 2519), (710, 2520), (710, 2521), (710, 2522), (710, 2523), (710, 2524), (710, 2525), (710, 2526), (710, 2527), (710, 2528), (710, 2529), (710, 2530), (710, 2531), (710, 2532), (710, 2533), (710, 2534), (710, 2535), (714, 711), (714, 716), (714, 719), (714, 720), (714, 721), (714, 722), (714, 2348), (714, 2351), (714, 2352), (714, 2354), (714, 2356), (714, 2366), (714, 2369), (714, 2370), (714, 2375), (714, 2386), (714, 2394), (714, 2395), (714, 2399), (714, 2402), (714, 2405), (714, 2407), (714, 2409), (714, 2431), (714, 2434), (714, 2444), (714, 2452), (714, 2461), (714, 2465), (714, 2469), (714, 2475), (714, 2482), (714, 2483), (714, 2484), (714, 2492), (714, 2509), (714, 2511), (714, 2518), (714, 2521), (714, 2523), (714, 2524), (714, 2526), (714, 2530), (714, 2594), (714, 2595), (714, 2596), (714, 2597), (714, 2598), (714, 2599), (714, 2600), (714, 2601), (714, 2602), (714, 2603), (714, 2604), (714, 2605), (714, 2606), (714, 2607), (714, 2608), (714, 2609), (714, 2610), (714, 2611), (714, 2612), (714, 2613), (714, 2614), (714, 2615), (714, 2616), (714, 2617), (714, 2618), 
(714, 2619), (714, 2620), (714, 2621), (714, 2622), (714, 2623), (714, 2624), (714, 2625), (714, 2626), (714, 2627), (714, 2628), (714, 2629), (714, 2630), (714, 2631), (714, 2632), (714, 2633), (714, 2634), (714, 2635), (714, 2636), (714, 2637), (714, 2638), (714, 2639), (714, 2640), (714, 2641), (714, 2642), (714, 2643), (714, 2644), (714, 2645), (714, 2646), (714, 2647), (714, 2648), (714, 2649), (714, 2650), (714, 2651), (714, 2652), (714, 2653), (714, 2654), (714, 2655), (714, 2656), (714, 2657), (714, 2658), (714, 2659), (714, 2660), (714, 2661), (714, 2662), (714, 2663), (714, 2664), (714, 2665), (714, 2666), (714, 2667), (714, 2668), (714, 2669), (714, 2670), (714, 2671), (714, 2672), (714, 2673), (714, 2674), (714, 2675), (714, 2676), (714, 2677), (714, 2678), (714, 2679), (714, 2680), (714, 2681), (714, 2682), (714, 2683), (714, 2684), (714, 2685), (714, 2686), (335, 2232), (2232, 2233), (2232, 2234), (2232, 2235), (2232, 2236), (2232, 2237), (2232, 2238), (2232, 2239), (2232, 2240), (2232, 2241), (2232, 2242), (2232, 2243), (2232, 2244), (2232, 2245), (2232, 2246), (2232, 2247), (2232, 2248), (2232, 2249), (2232, 2250), (2232, 2251), (2232, 2252), (2232, 2253), (2232, 2254), (2232, 2255), (2232, 2256), (2232, 2257), (2232, 2258), (2232, 2259), (2232, 2260), (2232, 2261), (2232, 2262), (2232, 2263), (2232, 2264), (2232, 2265), (2232, 2266), (2232, 2267), (2232, 2268), (2232, 2269), (2232, 2270), (2232, 2271), (2232, 2272), (2232, 2273), (2232, 2274), (2232, 2275), (2232, 2276), (2232, 2277), (2232, 2278), (2232, 2279), (2232, 2280), (2232, 2281), (2232, 2282), (2232, 2283), (2232, 2284), (2232, 2285), (2232, 2286), (2232, 2287), (2232, 2288), (2232, 2289), (2232, 2290), (2232, 2291), (2232, 2292), (2232, 2293), (2232, 2294), (2232, 2295), (2232, 2296), (2232, 2297), (2232, 2298), (2232, 2299), (2232, 2300), (2232, 2301), (2232, 2302), (2232, 2303), (2232, 2304), (2232, 2305), (2232, 2306), (2232, 2307), (2232, 2308), (2232, 2309), (2232, 2310), (2232, 
2311), (2232, 2312), (2232, 2313), (2232, 2314), (2232, 2315), (2232, 2316), (2232, 2317), (2232, 2318), (2232, 2319), (2232, 2320), (2232, 2321), (2232, 2322), (2232, 2323), (2232, 2324), (2232, 2325), (2232, 2326), (2232, 2327), (2232, 2328), (1524, 2699), (2594, 2536), (2699, 2687), (2699, 2698), (2699, 2709), (2699, 2714), (2699, 2720), (2699, 2730), (2699, 2746), (2699, 2748), (2699, 2754), (2699, 2773), (2699, 2775), (2699, 2777), (2699, 2801), (2699, 2804), (2699, 2805), (2699, 2806), (2699, 2811), (2699, 2824), (2699, 2826), (2699, 2829), (2699, 2831), (2699, 2841), (2699, 2857), (2699, 2858), (2699, 2859), (2699, 2860), (2699, 2861), (2699, 2862), (2699, 2863), (2699, 2864), (2699, 2865), (2699, 2866), (2699, 2867), (2699, 2868), (2699, 2869), (2699, 2870), (2699, 2871), (2699, 2872), (2699, 2873), (2699, 2874), (2699, 2875), (2699, 2876), (2699, 2877), (2699, 2878), (2699, 2879), (2699, 2880), (2699, 2881), (2699, 2882), (2699, 2883), (2699, 2884), (2699, 2885), (2699, 2886), (2699, 2887), (2699, 2888), (2536, 2537), (2536, 2538), (2536, 2539), (2536, 2540), (2536, 2541), (2536, 2542), (2536, 2543), (2536, 2544), (2536, 2545), (2536, 2546), (2536, 2547), (2536, 2548), (2536, 2549), (2536, 2550), (2536, 2551), (2536, 2552), (2536, 2553), (2536, 2554), (2536, 2555), (2536, 2556), (2536, 2557), (2536, 2558), (2536, 2559), (2536, 2560), (2536, 2561), (2536, 2562), (2536, 2563), (2536, 2564), (2536, 2565), (2536, 2566), (2536, 2567), (2536, 2568), (2536, 2569), (2536, 2570), (2536, 2571), (2536, 2572), (2536, 2573), (2536, 2574), (2536, 2575), (2536, 2576), (2536, 2577), (2536, 2578), (2536, 2579), (2536, 2580), (2536, 2581), (2536, 2582), (2536, 2583), (2536, 2584), (2536, 2585), (2536, 2586), (2536, 2587), (2536, 2588), (2536, 2589), (2536, 2590), (2536, 2591), (2536, 2592), (2536, 2593), (2687, 2688), (2687, 2689), (2687, 2690), (2687, 2691), (2687, 2692), (2687, 2693), (2687, 2694), (2687, 2695), (2687, 2696), (2687, 2697), (2687, 2698), (2687, 2700), 
(2687, 2701), (2687, 2702), (2687, 2703), (2687, 2704), (2687, 2705), (2687, 2706), (2687, 2707), (2687, 2708), (2687, 2709), (2687, 2710), (2687, 2711), (2687, 2712), (2687, 2713), (2687, 2714), (2687, 2715), (2687, 2716), (2687, 2717), (2687, 2718), (2687, 2719), (2687, 2720), (2687, 2721), (2687, 2722), (2687, 2723), (2687, 2724), (2687, 2725), (2687, 2726), (2687, 2727), (2687, 2728), (2687, 2729), (2687, 2730), (2687, 2731), (2687, 2732), (2687, 2733), (2687, 2734), (2687, 2735), (2687, 2736), (2687, 2737), (2687, 2738), (2687, 2739), (2687, 2740), (2687, 2741), (2687, 2742), (2687, 2743), (2687, 2744), (2687, 2745), (2687, 2746), (2687, 2747), (2687, 2748), (2687, 2749), (2687, 2750), (2687, 2751), (2687, 2752), (2687, 2753), (2687, 2754), (2687, 2755), (2687, 2756), (2687, 2757), (2687, 2758), (2687, 2759), (2687, 2760), (2687, 2761), (2687, 2762), (2687, 2763), (2687, 2764), (2687, 2765), (2687, 2766), (2687, 2767), (2687, 2768), (2687, 2769), (2687, 2770), (2687, 2771), (2687, 2772), (2687, 2773), (2687, 2774), (2687, 2775), (2687, 2776), (2687, 2777), (2687, 2778), (2687, 2779), (2687, 2780), (2687, 2781), (2687, 2782), (2687, 2783), (2687, 2784), (2687, 2785), (2687, 2786), (2687, 2787), (2687, 2788), (2687, 2789), (2687, 2790), (2687, 2791), (2687, 2792), (2687, 2793), (2687, 2794), (2687, 2795), (2687, 2796), (2687, 2797), (2687, 2798), (2687, 2799), (2687, 2800), (2687, 2801), (2687, 2802), (2687, 2803), (2687, 2804), (2687, 2805), (2687, 2806), (2687, 2807), (2687, 2808), (2687, 2809), (2687, 2810), (2687, 2811), (2687, 2812), (2687, 2813), (2687, 2814), (2687, 2815), (2687, 2816), (2687, 2817), (2687, 2818), (2687, 2819), (2687, 2820), (2687, 2821), (2687, 2822), (2687, 2823), (2687, 2824), (2687, 2825), (2687, 2826), (2687, 2827), (2687, 2828), (2687, 2829), (2687, 2830), (2687, 2831), (2687, 2832), (2687, 2833), (2687, 2834), (2687, 2835), (2687, 2836), (2687, 2837), (2687, 2838), (2687, 2839), (2687, 2840), (2687, 2841), (2687, 2842), (2687, 
2843), (2687, 2844), (2687, 2845), (2687, 2846), (2687, 2847), (2687, 2848), (2687, 2849), (2687, 2850), (2687, 2851), (2687, 2852), (2687, 2853), (2687, 2854), (2687, 2855), (2687, 2856), (2687, 2857)])
Edges are always shown as a pair of nodes, i.e., one edge represents the connection between two nodes. Nodes are always a single value, as shown above.
# Use `nx.draw_networkx` to visualise graphs
plt.figure(figsize=(10,8))
nx.draw_networkx(net_graph)
The above is a representation of the entire graph, with all the nodes and edges present. The nodes are represented in blue and the dark marks are the edges. We had already seen that some nodes were going to have multiple edges and the above plot confirms that assumption.
# Visualise just the nodes
nx.draw_networkx_nodes(net_graph, pos=nx.spring_layout(net_graph))
# Visualise just the edges
nx.draw_networkx_edges(net_graph, pos=nx.kamada_kawai_layout(net_graph))
For other, specialised methods of drawing a network graph, we can use the below layouts to get a better understanding of how the graph behaves.
G=net_graph
layout = nx.spring_layout(G)
plt.title('Spring Layout of Social Network Graph')
nx.draw(G,pos=layout,node_size=150,alpha=0.5)
G=net_graph
layout = nx.spectral_layout(G)
plt.title('Spectral Layout of Social Network Graph')
nx.draw(G,pos=layout,node_size=150,alpha=0.5)
After experimenting with a few layouts, we found the two above to be the most readable for our dataset.
We attempted to draw a planar layout, but our graph isn't planar, so an error was raised.
The Kamada-Kawai layout was too cluttered and difficult to interpret, which is why it was skipped.
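The planar-layout error mentioned above can be avoided by testing for planarity first. The following is a minimal sketch on two small example graphs, not the notebook's `net_graph`; the helper name `safe_layout` and the `seed` value are assumptions made for illustration.

```python
import networkx as nx


def safe_layout(graph, seed=42):
    """Return a planar layout if the graph is planar, otherwise a spring layout."""
    # check_planarity returns a (bool, certificate) pair; only the flag is needed here
    is_planar, _ = nx.check_planarity(graph)
    if is_planar:
        return "planar", nx.planar_layout(graph)
    return "spring", nx.spring_layout(graph, seed=seed)


planar_example = nx.cycle_graph(5)         # a simple ring, which is planar
non_planar_example = nx.complete_graph(5)  # K5, a classic non-planar graph

kind_a, pos_a = safe_layout(planar_example)
kind_b, pos_b = safe_layout(non_planar_example)
print(kind_a, kind_b)
```

The returned positions can be passed to `nx.draw` through its `pos` argument in the same way as the layouts above.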
| Property | Description |
|---|---|
| Degree | The degree of a node in a network is just the number of edges the node has connected to it. nx.degree(g,n) will return the degree of node n. |
| Connected components | The separate components that make up the network. A component is a set of nodes from which it is possible to reach all other nodes in the set. nx.number_connected_components(g) returns the number of connected components in the graph. |
| Diameter | The largest possible number of edges that must be traversed to travel on the shortest path between two nodes in the network. nx.diameter(g) will return the diameter of graph g. |
| Density | The density of a graph is a measure of how many edges it has relative to the total number of possible edges it could have. nx.density(g) will return the density of graph g. |
| Shortest path | The shortest route along edges in the graph from one node to another. If edges are weighted, the edge weight can be counted as the length of the edge. nx.shortest_path(g, n1, n2) will return the shortest path between nodes n1 and n2. nx.shortest_path(g) returns the shortest paths between every pair of nodes. |
nx.degree(net_graph, 1525)
710
nx.degree(net_graph, 603)
769
nx.degree(net_graph, 288)
481
nx.number_connected_components(net_graph)
1
nx.diameter(net_graph)
9
nx.density(net_graph)
0.0007150690793671507
nx.shortest_path(net_graph, 1, 2720)
[1, 69, 603, 1524, 2699, 2720]
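To make the properties in the table easier to verify by hand, here is a small sketch on a four-node path graph rather than on `net_graph`. The variable names (`toy`, `manual_density`, `manual_diameter`) are illustrative assumptions. It also shows that density can be computed as 2m / (n(n - 1)) for an undirected graph with n nodes and m edges, and that the diameter is the longest of all shortest-path lengths.

```python
import networkx as nx

# A path graph with four nodes: 0 - 1 - 2 - 3
toy = nx.path_graph(4)

n = toy.number_of_nodes()
m = toy.number_of_edges()

# Density of an undirected graph: edges present divided by edges possible
manual_density = 2 * m / (n * (n - 1))

# Diameter: the longest shortest path over all pairs of nodes
lengths = dict(nx.shortest_path_length(toy))
manual_diameter = max(max(targets.values()) for targets in lengths.values())

print(nx.degree(toy, 1))                    # node 1 touches two edges
print(nx.number_connected_components(toy))  # the path is one component
print(nx.diameter(toy), manual_diameter)
print(nx.density(toy), manual_density)
print(nx.shortest_path(toy, 0, 3))
```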
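Degree centrality measures the importance of a node based solely on the number of edges attached to it: the more connections a node has, the higher its degree centrality. NetworkX normalises the value by dividing each node's degree by the maximum possible degree, n - 1, where n is the number of nodes.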
# Source: https://networkx.org/nx-guides/content/exploratory_notebooks/facebook_notebook.html#centrality-measures
degree_centrality = nx.centrality.degree_centrality(net_graph)
(sorted(degree_centrality.items(), key=lambda item: item[1], reverse=True))[:10]
[(603, 0.2663664703844822), (1525, 0.24593003117422932), (288, 0.16660893661240042), (1, 0.09941115344648424), (710, 0.07655005195704884), (2687, 0.05888465535157603), (714, 0.04814686525805335), (2232, 0.03359889158295809), (2536, 0.020090058884655353), (2699, 0.019050917907862834)]
In the above code, we determine the degree centrality of the graph and only display the top ten highest values.
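The link between a node's degree and its degree centrality can be checked on a small graph. The sketch below uses a star graph instead of `net_graph`, and the names `star` and `manual_centrality` are assumptions for illustration. It recomputes degree centrality as degree / (n - 1) and compares the result with NetworkX.

```python
import math

import networkx as nx

# Star graph: node 0 in the centre, connected to four leaf nodes (1 to 4)
star = nx.star_graph(4)
n = star.number_of_nodes()

# Degree centrality by hand: degree divided by the maximum possible degree
manual_centrality = {node: degree / (n - 1) for node, degree in star.degree}

library_centrality = nx.degree_centrality(star)

for node in star.nodes:
    # The hand-computed values should agree with NetworkX up to rounding
    assert math.isclose(manual_centrality[node], library_centrality[node])

print(manual_centrality)
```

The centre node is connected to every other node, so its centrality is 1.0, while each leaf is connected to one of the four other nodes and scores 0.25.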
# Source: https://networkx.org/nx-guides/content/exploratory_notebooks/facebook_notebook.html#centrality-measures
(sorted(net_graph.degree, key=lambda item: item[1], reverse=True))[:10]
[(603, 769), (1525, 710), (288, 481), (1, 287), (710, 221), (2687, 170), (714, 139), (2232, 97), (2536, 58), (2699, 55)]
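Here, we sort the nodes by their raw degree (number of neighbours) and display the top ten. The ranking matches the degree centrality ranking above, since the two quantities differ only by a constant factor.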
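The relationship between raw degree and degree centrality can be checked on a small example. The sketch below uses a toy star graph (an assumption for illustration, not `net_graph`) and recomputes the centrality by hand as degree / (n - 1).

```python
import networkx as nx

# Toy graph (illustrative): hub node 0 connected to leaves 1..4
g = nx.star_graph(4)
n = g.number_of_nodes()

# Degree centrality as computed by NetworkX
centrality = nx.degree_centrality(g)

# The same quantity computed by hand from the raw degrees
manual = {node: deg / (n - 1) for node, deg in g.degree()}

for node in g.nodes:
    print(node, g.degree(node), centrality[node], manual[node])
```

The hub is connected to every other node, so its centrality is 1; each leaf is connected to one of the four other nodes, giving 0.25.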
plt.figure(figsize=(8, 5))
plt.hist(degree_centrality.values(), bins=25)
plt.xticks(ticks=[0, 0.05, 0.1, 0.15, 0.2, 0.25])
plt.title("Degree Centrality Histogram ", fontdict={"size": 15}, loc="center")
plt.xlabel("Degree Centrality", fontdict={"size": 10})
plt.ylabel("Counts", fontdict={"size": 10})
pos = nx.spring_layout(net_graph)
node_size = [v * 1000 for v in degree_centrality.values()]
plt.figure(figsize=(15, 8))
nx.draw_networkx(net_graph, pos=pos, node_size=node_size, with_labels=False, width=0.15)
plt.axis("off")
The histogram and the network drawing above show that the graph's connectivity depends on just a few important nodes: almost all nodes have a degree centrality close to 0, while a handful of hubs stand out.
In the case of our social network graph, degree centrality tells us how many connections (internet friends) a particular user has. Node 603 has the most connections (769) and therefore the highest degree centrality. Its value of about 0.266 means that it is directly connected to roughly 26.6% of the other nodes in the network.
The degree distribution describes how frequently each degree value occurs among the nodes of the network.
# Use the function `.degree_histogram()`.
# We skip the nodes that have a degree 0
ddist = nx.degree_histogram(net_graph)[1:]
plt.loglog(range(1,len(ddist)+1),ddist,'o')
plt.title('Degree Distribution')
plt.xlabel('Degree')
plt.ylabel('Frequency')
From the above plot, we can see that most nodes have a degree of 1, and that the number of nodes drops as the degree increases, leaving a long tail with only a few high-degree nodes.
For our social network graph, we can conclude that most of the nodes (users) have very few friends, while a small number of users have many connections.
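The indexing used in the plotting code above can be easy to misread, so here is a small sketch on a toy graph (an assumption for illustration). `nx.degree_histogram` returns a list whose index is the degree, so dropping the first entry removes the degree-0 nodes and the remaining entries correspond to degrees 1, 2, 3, and so on.

```python
import networkx as nx

# Toy graph (illustrative): a star with hub 0 and leaves 1..4,
# plus one isolated node that has degree 0
g = nx.star_graph(4)
g.add_node(99)

# hist[d] is the number of nodes with degree d
hist = nx.degree_histogram(g)

# Skip degree 0, then pair each remaining count with its degree
ddist = hist[1:]
degrees = list(range(1, len(ddist) + 1))

for degree, count in zip(degrees, ddist):
    print(degree, count)
```

Degrees that do not occur in the graph have a count of 0; points with a zero count cannot be drawn on a log-log plot.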
Clustering Coefficient looks at how interconnected the neighbours of a node in a graph are.
The local clustering coefficient of a node is defined as:
$$ C = \frac{2E_N}{k(k-1)} $$
Here, $E_N$ is the total number of edges between neighbours of the node, and $k$ is the number of neighbours.
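The formula can be checked by hand on a small example. The sketch below uses a toy graph and a hypothetical helper, `local_clustering`, which is not part of NetworkX; it simply applies the formula above and is compared against `nx.clustering`. Nodes with fewer than two neighbours are assigned 0, matching what NetworkX returns for them.

```python
import networkx as nx

# Toy graph (illustrative): node 0 has neighbours 1, 2, 3,
# and exactly one pair of those neighbours (1, 2) is connected
g = nx.Graph([(0, 1), (0, 2), (0, 3), (1, 2)])


def local_clustering(graph, node):
    """Hypothetical helper: C = 2 * E_N / (k * (k - 1))."""
    neighbours = list(graph.neighbors(node))
    k = len(neighbours)
    if k < 2:
        # No pair of neighbours exists, so the coefficient is taken as 0
        return 0.0
    # E_N: number of edges among the neighbours of the node
    e_n = graph.subgraph(neighbours).number_of_edges()
    return 2 * e_n / (k * (k - 1))


manual = {node: local_clustering(g, node) for node in g.nodes}
builtin = nx.clustering(g)

for node in g.nodes:
    print(node, manual[node], builtin[node])
```

For node 0, k = 3 and E_N = 1, so C = 2 / 6 = 1/3. Node 3 has a single neighbour and therefore a coefficient of 0.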
ccg = nx.clustering(net_graph)
plt.hist(list(ccg.values()),bins='auto')
plt.title('Clustering Coefficient')
plt.xlabel("Clustering Coefficient")
plt.ylabel("Frequency")
As seen in the degree distribution, most nodes are connected to a single node and hence have only one neighbour. A node with one neighbour has no pair of neighbours that could be linked to each other, so its clustering coefficient is 0. This is why the histogram is concentrated at 0.
In the case of our social network graph, a low clustering coefficient corresponds to a network wherein the connections don't form a tight-knit community.
Betweenness centrality measures the importance of a node by the fraction of shortest paths between pairs of other nodes that pass through it.
Nodes that have high betweenness centrality are ones that usually connect different regions of the network.
betweenness_centrality = nx.betweenness_centrality(net_graph)
(sorted(betweenness_centrality.items(), key=lambda item: item[1], reverse=True))[:10]
[(603, 0.5497065448918781), (288, 0.46612992918844975), (1525, 0.4294450041419194), (247, 0.24124220674273653), (1, 0.1860965105682874), (2699, 0.13099957488596214), (1524, 0.13019147414713747), (710, 0.12724689931998354), (714, 0.11276804568283787), (2687, 0.09928765193746143)]
The above code sorts the betweenness centrality of the network to display the most important nodes in the network.
We construct a plot and a graph below to show how disproportionate the important nodes are.
# Source: https://networkx.org/nx-guides/content/exploratory_notebooks/facebook_notebook.html#centrality-measures
plt.figure(figsize=(15, 8))
plt.hist(betweenness_centrality.values(), bins=100)
plt.xticks(ticks=[0, 0.062, 0.1, 0.128, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
plt.title("Betweenness Centrality Histogram ", fontdict={"size": 35}, loc="center")
plt.xlabel("Betweenness Centrality", fontdict={"size": 20})
plt.ylabel("Counts", fontdict={"size": 20})
# Source: https://networkx.org/nx-guides/content/exploratory_notebooks/facebook_notebook.html#centrality-measures
pos = nx.spring_layout(net_graph)
node_size = [v * 1000 for v in betweenness_centrality.values()]
plt.figure(figsize=(15, 8))
nx.draw_networkx(net_graph, pos=pos, node_size=node_size, with_labels=False, width=0.15)
plt.axis("off")
In a social network, betweenness centrality gauges how well a user connects different groups. A user with high betweenness centrality links many groups and can facilitate communication and information flow between them.
A user with low betweenness centrality has fewer connections to other groups and is less effective as a go-between. This is the case for most nodes in our social network, whereas nodes 603, 288 and 1525 have the three highest betweenness centrality values.
Looking back at the previous section, node 288 has a degree of 481, well below that of node 603 (769) and node 1525 (710). Judged by degree alone, it would rank a distant third. Yet it has the second-highest betweenness centrality, ahead of node 1525, because many shortest paths pass through it. Similarly, node 247 ranks fourth by betweenness centrality although it does not appear among the top ten nodes by degree. Degree is therefore not the only way to gauge the importance of a node.
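The difference between degree and betweenness centrality is easiest to see on a small example. The sketch below uses a toy barbell graph (an assumption for illustration, not `net_graph`): two fully connected groups of five nodes joined through a single bridge node. The bridge has the lowest degree in the graph but the highest betweenness centrality, because every shortest path between the two groups passes through it.

```python
import networkx as nx

# Toy graph (illustrative): two cliques of 5 nodes (0-4 and 6-10)
# joined by one bridge node, which NetworkX labels as node 5
g = nx.barbell_graph(5, 1)
bridge = 5

degree = dict(g.degree())
betweenness = nx.betweenness_centrality(g)

# Node with the highest betweenness and node with the lowest degree
top_betweenness = max(betweenness, key=betweenness.get)
lowest_degree = min(degree, key=degree.get)

print("bridge degree:", degree[bridge])
print("bridge betweenness:", betweenness[bridge])
print("top betweenness node:", top_betweenness)
print("lowest degree node:", lowest_degree)
```

Ranking the nodes of this toy graph by degree alone would put the bridge last, while ranking by betweenness centrality puts it first.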
Assortativity refers to the tendency of nodes in a network to connect to other nodes that are similar to them in some attribute or characteristic.
In our social network, assortativity describes users (nodes) connecting to other users with similar characteristics, such as similar demographics or a similar number of connections.
A network can be positively or negatively assortative. With positive assortativity, nodes tend to connect to similar nodes, for example nodes with a similar number of connections (similar degree). With negative assortativity, nodes tend to connect to dissimilar nodes, for example high-degree nodes connecting to low-degree nodes. The degree assortativity coefficient computed below measures this for node degree and ranges from -1 to 1.
nx.degree_assortativity_coefficient(net_graph)
-0.6682140067239861
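The coefficient of about -0.67 indicates strong negative assortativity by degree: high-degree nodes are mostly connected to low-degree nodes. This matches the structure seen earlier, where a few hubs such as nodes 603 and 1525 are linked to many nodes of degree 1. In a network with negative assortativity, nodes with dissimilar characteristics are more likely to be connected, which could lead to more efficient transmission of information or ideas within the network.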
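As a reference point for interpreting the coefficient, the sketch below computes the degree assortativity of a toy star graph (an assumption for illustration, not `net_graph`). In a star, every edge joins the high-degree hub to a degree-1 leaf, which is the extreme case of negative degree assortativity.

```python
import networkx as nx

# Toy graph (illustrative): hub 0 connected to leaves 1..5,
# so every edge links a degree-5 node to a degree-1 node
g = nx.star_graph(5)

r = nx.degree_assortativity_coefficient(g)
print(r)
```

The result is at the negative end of the scale. A value of about -0.67, as obtained for `net_graph`, sits between this hub-and-spoke extreme and a network with no degree correlation (a coefficient near 0).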
Pandas plotting
Matplotlib
Seaborn
Plotly
Scikit Learn
Scipy
Networkx